Objects are created from abstract data types that encapsulate and together
I’ve moved my blog. The new version of this post, which uses TypeScript rather than Flow, may be viewed here. Show
There are three main methods of representing data which developers are likely to encounter: abstract data types, algebraic data types, and objects. Abstract data types (frequently abbreviated “ADTs”) are likely familiar to developers with a computer science background, and algebraic data types (unfortunately also abbreviated “ADTs”) are likely familiar to developers with a functional programming background. Objects are a concept that most developers are extremely familiar with on a practical level, but many struggle to precisely define. These three approaches each have distinct tradeoffs, and familiarity with their characteristics can enable developers to choose the most advantageous approach for a given problem. This post will introduce and compare two of these, abstract data types and objects. I do the same for algebraic data types in the sequel to this post, here. Example code will be written in JavaScript using the Flow type system. This post is heavily based on what may be my favorite paper, “On Understanding Data Abstraction, Revisited” by William Cook. I have tried to pull out and present the concepts which are most likely to be relevant to the average JavaScript developer while omitting more abstract mathematical concerns, but for those who are motivated, I highly recommend reading the original paper and its companion essay. Abstract data types: representing data opaquelyAn abstract data type is a model for data consisting of values and operations, whose concrete structure is hidden. For example, a export interface NumberSetI { 0, export interface NumberSetI { 1, and export interface NumberSetI { 2. Each of these operations mutates or returns a Set value, but the means by which they do so, or by which a Set is represented, is hidden from consumers of the Set type. To be strictly correct, the word “model” is an operative part of this definition: an abstract data type is an idea, describing a data type that could be represented in different languages. Colloquially, it’s common to use the term “abstract data type” to refer both to the formal definition of the data type as well as to concrete implementations of it. I will use the colloquial definition, and refer to software components which hide their internal structure as abstract data types.There are several abstract data types built into the JavaScript language: export interface NumberSetI { 6, export interface NumberSetI { 7, and export interface NumberSetI { 8 are good examples. Developers have the ability to create and manipulate these, but are not able to access, modify, or extend their underlying representations. (Assuming we are limited by the constraints of good sense and decency and don’t try to modify the prototypes of things that we shouldn’t.) This leads to a fundamental tradeoff which characterizes abstract data types: by making the implementation of the data opaque, we have gained modularity at the expense of extensibility.Objects: representing data through composable interfacesCook’s definition of objects and object-oriented software discards many things which are normally considered essential to the paradigm, such as classes, mutable state, and inheritance. His definition instead centers on a total dedication to encapsulation. This is the definition that we will use here: a system is properly built out of objects if every function or method in the system only has access to the internals of a single abstraction. A method on the export interface NumberSetI { 9 class can only access the internal structure of export interface NumberSetI { 9; any interaction with another object must be done through that other object’s interface. Functions which do not exist as methods on a class cannot access the internals of any object, and can only interact with objects through those object’s published interface. Language features which allow you to break an object’s encapsulation, such as using export class Empty implements NumberSetI { 1 to figure out what concrete type of object is providing the interface being used, are completely disallowed.This definition is clearly based on established object-oriented principles: “maintain encapsulation” and “code to an interface, not an implementation” are well-known sayings among object-oriented programmers. This definition is much stricter, though, and a great deal of code which is often referred to as “object oriented” would be more accurately described as “imperative, using classes” under this definition. Cook calls the guiding principle of his object-oriented programming “autognosis”, meaning “only having knowledge of one’s self”. As we’ll see in the next section, implementations which follow autognosis have a very different set of costs and benefits than ones which do not. An abstract data type representation of a setTo make the nature of and distinction between abstract data types and objects more concrete, we’ll implement the same data structure in each. The structure in question will be a set holding numbers. The easiest implementation of this set as an abstract data type (though not the most efficient) is as a sorted array. This is what we’ll use here. We’ll provide five functions which operate on these sets: export class Empty implements NumberSetI { 2 will create an empty set, export interface NumberSetI { 0 will add an item to a set, export class Empty implements NumberSetI { 4 will tell us whether a set is empty, export interface NumberSetI { 2 will tell us whether a set contains a given item, and export class Empty implements NumberSetI { 6 will merge two sets together. Here is our implementation:export opaque type NumberSet = Array A few things to note about this implementation:
An object implementation of a setTo implement export class Empty implements NumberSetI { 8 with objects, we first need an interface for the objects that we’ll be working with. Here is this interface:export interface NumberSetI { This interface has four of the five functions which we implemented for our abstract data type. Rather than providing a way to get an empty set as a method on our interface, we’ll create a class for empty sets: export class Empty implements NumberSetI { The implementation of most of these methods is obvious. An empty set always returns class Insert implements NumberSetI { 9 when asked if it is empty, and add(n: number) { return new Insert(number, this); } 0 when asked if it contains a given item. The union of an empty set and another set is just this other set. Adding to an empty set is more puzzling, however. By definition, an empty set doesn’t hold anything, so add(n: number) { return new Insert(number, this); } 1 shouldn’t have any logic around holding items. We’ll address this by making a new class which is responsible for adding items to a set, which add(n: number) { return new Insert(number, this); } 1 can return for this purpose:class Insert implements NumberSetI { The add(n: number) { return new Insert(number, this); } 3 set is constructed by providing a number and another set, and is responsible for representing that set with the given number added to it. add(n: number) { return new Insert(number, this); } 3 only takes responsibility for this one number; if asked whether it contains a different number with export interface NumberSetI { 2, add(n: number) { return new Insert(number, this); } 3 will pass responsibility for determining this off to its contained set. We can use this class to complete our implementation of add(n: number) { return new Insert(number, this); } 1, with a trivial implementation of export interface NumberSetI { 0:add(n: number) { return new Insert(number, this); } We’ll implement the export class Empty implements NumberSetI { 6 method of add(n: number) { return new Insert(number, this); } 3 in a similar way. Taking the union of two sets is a very different responsibility than inserting a number into a set, so we’ll make a separate class for it:class Union implements NumberSetI { With this, we can implement export class Empty implements NumberSetI { 6 on add(n: number) { return new Insert(number, this); } 3 asunion(set: NumberSetI) { return new Union(this, set); } Compare our class Union implements NumberSetI { 3 object to the export class Empty implements NumberSetI { 6 implementation of our abstract data type. That implementation combined two sets by being aware of each of their internal structures. This one implements its export interface NumberSetI { 2 and export class Empty implements NumberSetI { 4 functions by delegating responsibility to each of the sets which it wraps. It interacts with these other sets through their public class Union implements NumberSetI { 7 interface, without knowing what concrete class is involved. This means that our class Union implements NumberSetI { 3 class respects autognosis.Just as with our abstract data type implementation, all of our data here is immutable. The export interface NumberSetI { 0 and export class Empty implements NumberSetI { 6 methods don’t modify the sets they’re called on, but rather return a new set. While people usually think of object-oriented programming as being based on mutability, objects can be very useful without any mutable state.Tradeoffs between abstract data types and objectsAbstract data types and objects have radically different characteristics, both in their implementations and in their usage in a larger system. Looking first at these two implementations in isolation, we can see that their performance characteristics are radically different. For our abstract data type, export class Empty implements NumberSetI { 4 was a constant time operation, and the runtime of export class Empty implements NumberSetI { 6, export interface NumberSetI { 0 and export interface NumberSetI { 2 were based on the number of items in the set. For our object implementation, export class Empty implements NumberSetI { 6 and export interface NumberSetI { 0 are constant time operations, while the runtime of export interface NumberSetI { 2 and export class Empty implements NumberSetI { 4 are based on the total number of methods that have been called on our set. In general, many performance optimizations are based on reducing some complex or redundant structure to an equivalent but smaller case. When we follow autognosis, this approach is not available, because we don’t get to inspect the concrete structure that we’ve created. For example, when we take the export class Empty implements NumberSetI { 6 of two sets in the abstract data type implementation we eliminate any duplicate entries. If we take the union of the same sets over and over, our set will not grow, and later calls to export interface NumberSetI { 2 will not be any slower. In our object implementation, class Union implements NumberSetI { 3 has no way to know whether the sets its combining overlap, and so can’t perform any kind of simplification. Calling export class Empty implements NumberSetI { 6 repeatedly on the same sets will result in more and more class Union implements NumberSetI { 3 nodes being added, and later calls to export interface NumberSetI { 2 on this set being slower.There is a justification for this cost: object implementations can be easily extended in ways that are impossible for abstract data types. The internal structure of an abstract data type determines and limits what data it can represent, but objects can represent anything for which we can define their interface methods. For instance, our abstract data type implementation has no way to represent infinite sets, but our object implementation can represent them easily: export class Everything implements NumberSetI { export class Everything implements NumberSetI {5 is the set which contains all numbers, while export class Everything implements NumberSetI {6 contains every even integer. Their implementations are trivial, and they can be used with all of our existing sets. If the sets we’ve written so far were published as a library, callers of the library would be free to implement their own sets, without needing anything from our library but the interface definition. The set of all prime numbers, the set of all numbers in a range, or any other set you’d like could be written and would interoperate with each other. Objects are clearly superior to abstract data types when it comes to adding new representations, but this doesn’t mean that they are more extensible in every way. Consider adding a export class Everything implements NumberSetI {7 function, which takes a number, and returns the smallest integer which is contained in the set that is larger than the provided number, or export class Everything implements NumberSetI {8 if no such integer exists. Adding this to the abstract data type requires adding one additional function: function smallestIntegerAbove( To add this method to our object implementation, though, we’d have to modify every class Union implements NumberSetI { 7 instance that we’ve implemented. add(n: number) { return new Insert(number, this); } 1, add(n: number) { return new Insert(number, this); } 3, class Union implements NumberSetI { 3, export class Everything implements NumberSetI {5 and function smallestIntegerAbove(4 will all have different implementations of export class Everything implements NumberSetI {7. Worse, imagine that we had published both of these set implementations on function smallestIntegerAbove(6. Adding a new function to the abstract data type implementation would be a minor change; our users could upgrade and just ignore the new function if they don’t want to use it. For the object implementation, extending the interface expected from objects would be a breaking change. Upgrading would require users to write additional code in order to stay compatible with other set implementations. Abstract data types are thus easily extensible in terms of operations, but not in terms of representations, while objects are easily extensible in terms of representations, but not in terms of operations. This tradeoff of operational extensibility versus representational extensibility is known as “the expression problem”. There are solutions to this problem which enable both kinds of extensibility for objects in a modular way, but they are much more involved than the easy representational extensibility which we have right now. Abstract data types and objects both provide data abstraction. This means that, from the perspective of a user consuming them, data is defined in terms of “what can be done with this data” rather than “what structure this data has.” For abstract data types, this abstraction is achieved by implementing a structure for data internally, and hiding this structure from users. For objects, no consistent structure exists, and data is represented as the composition of the behaviors which appear on the object’s interface. The individual objects which are combined to form a single data structure are as isolated from each other as the implementations of abstract data types are from the users who consume these types. Techniques which break the abstraction provided by object an object’s interface, such as the use of export class Empty implements NumberSetI { 1, are thus strictly forbidden.This fundamental distinction leads to a number of tradeoffs. Abstract data types are much easier to optimize than objects, but are impossible to extend without access to their internal implementation. Objects may be freely extended by creating additional representations which conform to the object’s published interface, even by users who are consuming the objects from a library. The use of concrete structures inside of abstract data types makes extending these structures with additional representations difficult, but a library author has a great deal of freedom to add new operations. In contrast, despite the ease with which objects may be extended representationally, adding new operations to them tends to require sweeping changes. The two concepts are complementary, and one’s strength tends to be the other’s weakness. When you need data abstraction, knowing their tradeoffs is immensely important for choosing the appropriate tool for a given situation. The capabilities of these two are vast, but they aren’t the only approaches to representing data. Algebraic data types are a third option, which represent data as a composable, concrete structure. I introduce them and contrast them with abstract data types and objects here. Why are classes in C++ called abstract data type?It does not specify how data will be organized in memory and what algorithms will be used for implementing the operations. It is called “abstract” because it gives an implementation-independent view. The process of providing only the essentials and hiding the details is known as abstraction.
What is an abstract data type quizlet?Abstract Data Types. A specification for a group of values and the operations on those values that is defined conceptually and independently of any programming language. Defining ADT in an interface. Make an interface with the abstract methods as the operations that can be done to that ADT.
What is the default access for members of the class abstract data type?With the class keyword, the default access level is private.
What is the difference between an ADT and a class in C ++?According to Code Complete, ADT is a collection of data and operations that work on that data. ADT form the foundation for the concept of classes. In languages that support classes, you can implement each abstract data type as its own class.
|