Data Structures — When to Use Which

Nil Seri
7 min readJan 21, 2022

--

Java Data Structures and How They Internally Work

Photo by Riccardo Bergamini on Unsplash

Class Hierarchies

https://www.javatpoint.com/collections-in-java
https://www.geeksforgeeks.org/linkedhashmap-class-in-java/

Array

  • Tuple (finite ordered list of elements)
  • Sequential
  • Linear Arrays — fixed size
    Dynamic Arrays — reserved space for additional elements. If full, it copies its content to a larger array
- Optimal for indexing
- Bad at searching, inserting, deleting (excluding the end)

Complexity:

indexing = O(1)
search = O(n)
optimized search = O(logn)
insertion = (for dynamic) O(n)

ArrayList (Java)

  • implements List
  • Internally uses an Object[] of default size 10 (if not declared).
  • When you add an item, it checks if there is any space left for the new element.
    If space is not a problem, the new item is added at the next empty space.
    If not; a larger array of 50% (using right shift operator to calculate) more the initial size is created and the current array is copied to the new one (using Arrays.copyOf).
  • When you remove an element, elements are shifted (using Arrays.copyOf).
- add
- addAll
- clear
- clone
- remove
- subList
- toArray

Complexity:

append / get : O(1)
add / remove / indexOf / contains : O(n)

LinkedList

  • Chain of nodes
  • A node holds data of its own and a reference to its next node.
  • Doubly Linked List — reference to previous and next nodes
    Circularly Linked List — head&tail linked
    Stack — LIFO, most commonly with LinkedLists (head is the only place for insertion and removal) but also with Arrays
    Queue — FIFO, implemented with LinkedLists (a doubly linked list that only adds to tail and removes from head) or Arrays
- Optimized for insertion/deletion
- Slow at indexing/searching

Complexity:

indexing = O(n)
search = O(n)
optimized search = O(n)
append = O(1)
prepend = O(1)
insertion = O(n)

Stack (Java)

  • implements List and extends Vector (Vector class is thread-safe)
- empty
- peek
- pop
- push
- search
https://www.scientecheasy.com/2020/09/vector-in-java.html/

PriorityQueue (Java)

  • implements Queue
  • When new elements are inserted into the PriorityQueue, they are ordered (and retrieved later) based on their natural ordering or by a defined “Comparator” provided when we construct the PriorityQueue.
  • The internal working of the PriorityQueue is based on the Binary Heap.
  • not thread-safe
- offer
- poll
- peek

Complexity:

- enque / deque : O(log(n))
- retrieval : O(1)
- contains: O(n)

LinkedList (Java)

  • implements List and Deque interfaces
  • List implementation (doubly linked list)
  • null elements are allowed.
  • not good at iteration; best at removing the current element during the iteration
  • There is a static “Node” class.
  • LinkedList class holds “first” and “last” variables.
  • When you add the very first item, both the “first” and “last” point to the new “Node”. They get updated according to the operation type.
- add
- addFirst
- addLast
- remove
- removeFirst
- removeLast

Complexity:

- append : O(1)
- add / get / remove / contains : O(n)

ArrayDeque (Java)

  • implements Deque — supports insertion, removal and retrieval of elements at both ends.
  • Resizeable array implementation
  • Automatically doubles the size of an array when head and tail pointer meets each other while adding an element
  • null elements are not allowed.
  • Not thread-safe
  • ArrayDeque implementation consumes less memory than LinkedList implementation.
  • More efficient than the LinkedList for add and remove operation at both ends.
  • Works significantly faster than the synchronized Stack.
  • Is a faster queue than LinkedList due to the better locality of reference.
- addFirst
- addLast
- removeFirst
- removeLast
- getFirst
- getLast

Complexity:

Most ArrayDeque operations run in amortized constant time (O(1)).

When to use which:

- Use an ArrayList if you need to access elements by index and you only need to insert/delete at the end. 
- Use an ArrayDeque as a stack, queue, or deque.
- Use a LinkedList if you need to insert/delete while iterating through the list, or if you need insertion at the ends to be worst-case O(1).

Map

  • Hash Table or Hash Map
  • data as key-value pairs
  • hashing (a key and its unique output — beware of hash collisions)
  • associative arrays, database indexing
- designed to optimize searching, insertion, deletion 

Complexity:

indexing = O(1)
search = O(1)
insertion = O(1)

HashMap (Java)

  • extends AbstractMap
  • contains an array of “Node” which has “hash”, “key”, “value”, “next” (points to the next node in the same bucket of array table)
  • Hashing = process of converting an object into integer form by using the method “hashCode”.
  • A bucket is one element of HashMap array, used to store nodes.
    A single bucket can have more than one node (depending on hashCode); using link list to connect the nodes.
  • Buckets are different in capacity.
  • capacity = number of buckets * load factor
  • index = hashCode(key) & (n-1)
  • allows null key (with hashCode value = 0), multiple null values
  • initial size = 16 with a load factor of 0.75 (75% of the capacity)
    load factor = a tuning parameter; the measure that decides when to increase the capacity of the Map.
  • The threshold of a HashMap is approximately = current capacity * load factor.
  • When the number of entries in the hash table exceeds the threshold, the Map is rehashed so that it has approximately twice the number of buckets as before.
  • Rehashing is the process of re-calculating the hash code of already stored entries.
  • Collision occurs when a hash function returns the same bucket location for two different keys.
  • In case of collision; it checks via hashCode and equals method that if both the keys are same.
    If keys are same, replace the value with current value.
    Otherwise, connect this node object to the previous node object via linked list so both are stored at the same index.
  • HashMap initially uses the LinkedList.
    When the number of entries crosses a certain threshold (TREEIFY_THRESHOLD = 8), it will replace a LinkedList with a balanced binary tree (red-black tree).
- put
- get
- getOrDefault
- remove
- containsKey
- containsValue
- isEmpty
- size

Complexity:

- O(1) for insertion and lookup

LinkedHashMap (Java)

  • unique elements with insertion order
  • non-synchronized
  • next pointer required for LinkedList functionality.
  • a Doubly LinkedList is also maintained to handle the insertion order.
  • To maintain insertion order; 2 more pointers (before and after) at every Node.
    before: points to the node inserted before this node
    after: points to the node inserted after this node

Complexity:

- O(1) for insertion and lookup

Example:
Insertion Order:
- Dinesh (calculated index = 5)
- Anamika (calculated index = 2)
- Arnav (calculated index = 2)

Status After Insertion of Arnav:

https://www.dineshonjava.com/internal-working-of-linkedhashmap-in-java/

TreeMap (Java)

  • Does not use hashing for storing key-value pairs.
  • Uses a data structure called the Red-Black Tree — a self-balancing binary search tree (for sorting).
  • a Comparator can be provided at creation time.
  • Node has references to “parent”, “right”, “left” elements.

Complexity:

O(logN) for insertion and lookup

Example:
Insertion Order:
- Ahmedabad (key = H)
- Jaipur (key = D)
- Delhi (key = B)
- Agra (key = F)
- Patna (key = P)

https://www.javatpoint.com/how-treemap-works-internally-in-java

Set

  • Duplicate elements are not allowed.
  • HashSet, TreeSet, LinkedHashSet

HashSet (Java)

  • HashSet class has a private HashMap property.
  • a Set achieves uniqueness internally through HashMap
  • When you call “add” method on HashSet, it actually calls its private HashMap object’s “put” method (as well as “remove”).
  • The elements you add into HashSet are stored as keys of this HashMap object. The value associated with those keys will be a constant.
  • “put” method returns the previous value associated with key, or null if there was no mapping for key.
  • default init size 16, always rounded up to a power of 2
  • not synchronized
- add
- isEmpty
- size
- remove
- contains
- clear
- clone

Complexity:

O(1) for the basic operations (add, remove and contains)

TreeSet (Java)

  • TreeSet is like HashSet, containing unique elements but in a sorted manner.
  • uses TreeMap object internally (and TreeMap uses HashMap internally)
  • The elements are ordered using Red-Black Tree with their natural ordering (default ascending) or by a Comparator provided at creation time.
  • not synchronized
  • no tuning parameters.

Complexity:

O(1) for basic operations (add, remove and contains) assuming the hash function disperses the elements properly among the buckets.

LinkedHashSet (Java)

  • extends HashSet
  • uses LinkedHashMap object internally (to maintain insertion order)

Complexity:

O(1) for basic operations (add, remove and contains)

HashTable (Java)

  • extends Dictionary and implements Map
  • synchronized
  • does not allow null value or key
  • internally contains buckets in which it stores the key/value pairs
  • hashcode to determine to which bucket the key/value pair should map

Complexity:

O(n) for most common operations such as get, put, contains

Happy Coding!

--

--

Nil Seri
Nil Seri

Written by Nil Seri

I would love to change the world, but they won’t give me the source code | coding 👩🏻‍💻 | coffee ☕️ | jazz 🎷 | anime 🐲 | books 📚 | drawing 🎨

Responses (3)