Computer science emphasizes two important topics: datastructures and algorithms. Those topics are important because the choices you make for a program's datastructures and algorithms affect that program's memory usage (for datastructures) and CPU time (for algorithms that interact with those datastructures). When choosing a datastructure or algorithm, you sometimes discover an inverse relationship between memory usage and CPU time: the less memory a datastructure uses, the more CPU time associated algorithms need to process the datastructure's data items, which are primitive type values or objects, via references. Also, the more memory a datastructure uses, the less CPU time associated algorithms need to process the data items-and faster algorithms result. This inverse relationship appears in Figure 1. Figure 1. The inverse relationship between memory usage and CPU time CPU Time Memory Usage
An example of the inere relationship betwen memary usage and CPU time involves the one-dimensional array and doubly-linked list datastructures, and their insertion/deletion algorithms. For any given list of data items, a one dimensional array occupies less memory than a doubly linked list: a doubly linked list needs to associate links with data items to find each data item's predecessor and successor, which requires extra memory. In contrast, a one- dimensional array's insertion/deletion algorithms are slower than a doubly linked list's equivalent algorithms: inserting a data item into or deleting a data item from a one-dimensional array requires data item movement to expose an empty element for insertion or close an element made empty by deletion. (lI explore one-dimensional arrays later in this article, and doubly linked lists in next month's article.) This article initiates a two-part series that explores datastructures and algorithms. The article begins with a presentation of basic concepts and continues with a tour of the array datastructure. You learn about one- dimensional, two-dimensional, and ragged arrays, plus linear-search, bubble- sort, binary-search, and matri-multiplication array-oriented algorithms. The article ends by asserting that Java's arrays are objects.
Datastructure and algorithm basics Before we explore specific datastructures and algorithms, we need to examine three basic questions: What is a datastructure? What is an algorithm? How do you represent an algorithm? Knowledge of those concepts helps you understand this series. What is a datastructure? Datastructures have been around since the structured programming era. A definition from that era: a datastructure is a set of types, a designated type from that type set, a set of functions, and a set of axioms. That definition implies that a datastructure is a type with implementation. In our object-oriented programming era, type with implementation means class. The datastructure is a class definition is too broad because it embraces Employee, Vehicle, Account, and many other real-world entity-specific classes as datastructures. Although those classes structure various data items, they do so to describe real-world entities (in the form of objects) instead of describing container objects for other entity (and possibly container) objects. This containment idea leads to a more appropriate datastructure definition: a
capabilities for storing and retrieving data items. Examples of container datastructures: arrays, linked lists, stacks, and queues. (I will explore the linked- list, stack, and queue datastructures next month.) What is an algorithm? Algorithms often associate with datastructures. An algorithm is a sequence of instructions that accomplishes a task in a finite time period. The algorithm receives zero or more inputs, produces at least one output, consists of clear and unambiguous instructions, terminates after a finite number of steps, and is basic enough that a person can carry out the algorithm using a pencil and paper. In may never terminate without external intervention. Examples of algorithms associated with datastructures: linear-search, bubble-sort, binary-search, matrix multiplication, linked-list-concatenation, stack-push-and-pop, and queue-insert- and-delete. (l will explore the linked-list-concatenation, stack-push-and-pop, and queue-insert-and-delete algorithms next month.)
Algorithm representation source code. However, writing source code before you fully understand an algorithm often leads to diffiult to-find bug. One technique for overcoming those bugs involves flowcharts. A flowchart is a visual representation of an algorithm's control flow. That representation illustrates statements that need to execute, decisions that need to be made, logic flow (for iteration and other purposes), and terminals that indicate start and end points. To present that visual representation, a flowchart uses various symbols, which Figure 2 shows. Figure 2. Flowchart symbols for statements, decisions, logic flow and terminals Statement Decisionm
Logic flow (for iteration and other purp oses) Terininal What does a flowchart look like? Suppose you have a simple algorithm that initializes a counter to 0, reads characters until a new-line n) character is seen, increments the counter for each read digit character, and prints the counters value after the new-line character has been read. Figure 3's flowchart illustrates this algorithm's control flow Starc Setcmmt 0 Read ch
Read ch No No Yes Yes cmint- Print. cmnt Figure 3. A flowchart for counting digit characters. Click on thumbnail to view full-size image. flowchart's advantages include its simplicity and its ability to present an algorithm's control flow visually. Flowcharts also have disadvantages: . Highly-detailed flowcharts can introduce errors or inaccuracies. Extra time is required to position, label, and connect a flowchart's symbols, even though tools speed up this process. This delay might slow your understanding of an algorithmm . Because flowcharts are tools of the structured programming era, they
Because flowcharts are tools of the structured programming era, they arent as usetul in an object oriented content The Unifhed Modeling Language (UML) is more appropriate for providing object-oriented visual representations. An alternative to flowcharts is pseudocode: a textual representation of an algorithm that approximates the final source code. Pseudocode is useful for quickly writing down an algorithm's representation. Because syntax is not a concern, no rules define how to write pseudocode. Consider the following example DECLARE CHARACTER ch DECLARE INTEGER counte DO READ ch IF ch IS e THROUGH 9 THEN count++ END IF UNTIL ch IS n PRINT count END
Each element occupies the same number of bytes; the exact number depends on the type of the element's data item o All elements share the same type. We tend to think of an array's elements as occupying consecutive memory locations. When I discuss two-dimensional arrays, you'll discover that isn't always the case The number of indexes associated with any element is the array's dimension Note This section focuses exclusively on arrays of one and two dimensions because higher-dimensional arrays are not used as frequently One-dimensional arrays The simplest kind of array has one dimension: each element associates with a single index. Java provides three techniques for creating a one-dimensional
The simplest kind of array has one dimension: each element associates with a single index. Java provides three techniques for creating a one-dimensional e only keyword new initializer. Because l showed you how to create a one-dimensional array with only an initializer in "Non-Object-Oriented Language Basics, Part 1, I focus on the latter two techniques in this section. Use only keyword new to create a one-dimensional array. That technique requires either of the following syntaxes type '[..]. variable-name new' type [ integer_expression ' type ' variable_name ' 'new' type I integer_expression Either syntax: Specifies variable name as the name of the one-dimensional array variable . Specifies type as each element's type. Because the one-dimensional array variable holds a reference to the one-dimensional array, the variable's type
Specifies keyword new, followed by type, followed by integer_expression between square brackets (I ]), which identifies the number of elements. new allocates memory for a one-dimensional array's elements and zeroes all bits in each element's bytes, which means that each element contains a default value that you interpret based on type Specifies to assign the one-dimensional array's reference to variable name. Tip Java developers often place square bracket characters after type (int [] test_scores) rather than after variable_name (int test_scores [) when declaring an array variable. Keeping all type information in one place improves the source code's readability. The following code fragment use only keyword new to create a one-dimensional array that stores data items based on a primitive type: int [] test scores new int [4]; int [1 test_scores declares a one-dimensional array variable (test_scores)
int [1 test_scores declares a one-dimensional array variable (test_scores) along with that variable's type (int [1). The int [] reference type signifies that each element must contain a data item of primitive type integer. new int [4] creates a one-dimensional array by allocating memory for four consecutive integer elements. Each element holds a single integer and initializes to 0 Operator equal-to ) assigns the one-dimensional array's reference to test scores. Figure 4 illustrates the resulting one-dimensional array variable and elements. Figure 4. The test scores one-dimensional array reference variable contains a reference to a four-element one-dimensional array of integers. Click on thumbnail to view full-size image.
Primitive type one-dimensional arrays store data items that are primitive values. In contrast, reference type one-dimensional arrays store data items that reference objects. The following code fragment uses only keyword new to create a pair of one-dimensional arrays that each store data items based on a reference type: Clock [] c1 = new Clock [3]; Clock [ c2 new AlarmClock [3]; clock [] c1 - new Clock [3]; declares a one-dimensional array variable, (c1) of type clock [1, allocates memory for a one-dimensional clock array consisting of three consecutive elements, and assigns the clock array's reference to c1. Each element must contain a reference to either a clock object (assuming clock is a concrete class) or an object created from a concrete Clock subclass, and initializes to null. Clock [1 c2new AlarmClock [3]; resembles the previous declaration, except that a one-dimensional AlarmClock array is created, and its reference assigns to clock [1 variable c2. (Assume AlarmClock subclasses Clock.)
type variable_name '[ 'new type initializer ' type '' variable_name 'new' type ' initializer where initializer has the following syntax: initial_value 1 Either syntax Specifies variable_name as the name of the one-dimensional array variable . Specifies type as the type of each element. Because the one-dimensional array variable holds a reference to the one-dimensional array, the variable's type is type . Specifies keyword new, followed by type, followed by empty square brackets, followed by initializer. You do not specify the number of elements between the square brackets because the compiler determines