57_[자바] Set

Set

A Set refuses to hold more than one instance of each object value. If you try to add more than one instance of an equivalent object, the Set prevents duplication. The most common use for a Set is to test for membership, so that you can easily ask whether an object is in a Set. Because of this, lookup is typically the most important operation for a Set, so you’ll usually choose a HashSet implementation, which is optimized for rapid lookup.

Set has the same interface as Collection, so there isn’t any extra functionality like there is in the two different types of List. Instead, the Set is exactly a Collection—it just has different behavior. (This is the ideal use of inheritance and polymorphism: to express different behavior.) A Set determines membership based on the "value" of an object, a more complex topic that you will learn about in the Containers in Depth chapter.

Here’s an example that uses a HashSet with Integer objects:

//: holding/SetOfInteger.java
import java.util.*;
public class SetOfInteger {
  public static void main(String[] args) {
    Random rand = new Random(47);
    Set<Integer> intset = new HashSet<Integer>();
    for(int i = 0; i < 10000; i++)
      intset.add(rand.nextInt(30));
    System.out.println(intset);
  }
} /* Output:
[15, 8, 23, 16, 7, 22, 9, 21, 6, 1, 29, 14, 24, 4, 19, 26, 11, 18, 3, 12, 27, 17, 2, 13, 28, 20, 25, 10, 5, 0]
*///:~

Ten thousand random numbers from o up to 29 are added to the Set, so you can imagine that each value has many duplications. And yet you can see that only one instance of each appears in the result.

You’ll also notice that the output is in no discernible order. This is because a HashSet uses hashing for speed—hashing is covered in the Containers in Depth chapter. The order maintained by a HashSet is different from a TreeSet or a LinkedHashSet, since each implementation has a different way of storing elements. TreeSet keeps elements sorted into a red-black tree data structure, whereas HashSet uses the hashing function. LinkedHashSet also uses hashing for lookup speed, but appears to maintain elements in insertion order using a linked list.

If you want the results to be sorted, one approach is to use a TreeSet instead of a HashSet:

//: holding/SortedSetOfInteger.java
import java.util.*;
public class SortedSetOfInteger {
  public static void main(String[] args) {
    Random rand = new Random(47);
    SortedSet<Integer> intset = new TreeSet<Integer>();
    for(int i = 0; i < 10000; i++)
      intset.add(rand.nextInt(30));
    System.out.println(intset);
  }
} /* Output:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
*///:~

One of the most common operations you will perform is a test for set membership using contains( ), but there are also operations that will remind you of the Venn diagrams you may have been taught in elementary school:

//: holding/SetOperations.java
import java.util.*;
import static net.mindview.util.Print.*;
public class SetOperations {
  public static void main(String[] args) {
    Set<String> set1 = new HashSet<String>();
    Collections.addAll(set1,
      "A B C D E F G H I J K L".split(" "));
    set1.add("M");
    print("H: " + set1.contains("H"));
    print("N: " + set1.contains("N"));
    Set<String> set2 = new HashSet<String>();
    Collections.addAll(set2, "H I J K L".split(" "));
    print("set2 in set1: " + set1.containsAll(set2));
    set1.remove("H");
    print("set1: " + set1);
    print("set2 in set1: " + set1.containsAll(set2));
    set1.removeAll(set2);
    print("set2 removed from set1: " + set1);
    Collections.addAll(set1, "X Y Z".split(" "));
    print("'X Y Z' added to set1: " + set1);
  }
} /* Output:
H: true
N: false
set2 in set1: true
set1: [D, K, C, B, L, G, I, M, A, F, J, E]
set2 in set1: false
set2 removed from set1: [D, C, B, G, M, A, F, E]
'X Y Z' added to set1: [Z, D, C, B, G, M, A, F, Y, X, E]
*///:~

The method names are self-explanatory, and there are a few more that you will find in the JDK documentation.

Producing a list of unique elements can be quite useful. For example, suppose you’d like to list all the words in the file SetOperations.java, above. Using the net.mindview.TextFile utility that will be introduced later in the book, you can open and read a file into a Set:

//: holding/UniqueWords.java
import java.util.*;
import net.mindview.util.*;
public class UniqueWords {
  public static void main(String[] args) {
    Set<String> words = new TreeSet<String>(
      new TextFile("SetOperations.java", "\\W+"));
    System.out.println(words);
  }
} /* Output:
[A, B, C, Collections, D, E, F, G, H, HashSet, I, J, K, L, M, N, Output, Print, Set, SetOperations, String, X, Y, Z, add, addAll, added, args, class, contains, containsAll, false, from, holding, import, in, java, main, mindview, net, new, print, public, remove, removeAll, removed, set1, set2, split, static, to, true, util, void]
*///:~

TextFile is inherited from List<String>. The TextFile constructor opens the file and breaks it into words according to the regular expression "\\W+", which means "one or more letters" (regular expressions are introduced in the Strings chapter). The result is handed to the TreeSet constructor, which adds the contents of the List to itself. Since it is a TreeSet, the result is sorted. In this case, the sorting is done lexicographically so that the uppercase and lowercase letters are in separate groups. If you’d like to sort it alphabetically, you can pass the String.CASE_INSENSITIVE_ORDER Comparator (a comparator is an object that establishes order) to the TreeSet constructor:

[Thinking in Java, 292]

댓글