aboutsummaryrefslogtreecommitdiffstats
path: root/gcc-4.4.3/libstdc++-v3/doc/html/ext/pb_ds/hash_zlob_random_int_find_find_timing_test.html
blob: bfbb3b0866f3ba1c44d2e086676a886999798575 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta name="generator" content="HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />
<title>Hash Skewed Distribution Memory Use Test</title>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
</head>
<body>
<div id="page">
<h1>Hash-Based Skewed-Distribution Random-Integer <tt>find</tt>
    Find Timing Test</h1>
<h2><a name="description" id="description">Description</a></h2>
<p>This test inserts a number of values with a markedly
    non-uniform i.i.d. integer keys into a container, then performs
    a series of finds using <tt>find</tt> . It measures the average
    time for <tt>find</tt> as a function of the number of values in
    the containers. The keys are generated as follows. First, a
    uniform integer is created; it is then shifted left 8 bits.</p>
<p>(The test was executed with <a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/performance/ext/pb_ds/hash_zlob_random_int_find_timing.cc"><tt>hash_zlob_random_int_find_timing_test</tt></a>
    200 200 2100)</p>
<h2><a name="purpose" id="purpose">Purpose</a></h2>
<p>The test checks the effect of different range-hashing
    functions and trigger policies (see <a href="hash_based_containers.html#hash_policies">Design::Associative
    Containers::Hash-Based Containers::Hash Policies</a> and
    <a href="hash_based_containers.html#resize_policies">Design::Associative
    Containers::Hash-Based Containers::Resize Policies</a>).</p>
<h2><a name="results" id="results">Results</a></h2>
<p>Figures <a href="#NHG">NHG</a>, <a href="#NHM">NHM</a>, and
    <a href="#NHL">NHL</a> show the results for various hash-based
    associative-containers in <a href="assoc_performance_tests.html#gcc"><u>g++</u></a>, <a href="assoc_performance_tests.html#msvc"><u>MSVC++</u></a>, and
    <a href="assoc_performance_tests.html#local"><u>local</u></a>,
    respectively.</p>
<div id="NHG_res_div">
<div id="NHG_gcc">
<div id="NHG_hash_zlob_random_int_find_timing_test">
<div id="NHG_assoc">
<div id="NHG_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHG" id="NHG"><img src="hash_zlob_random_int_find_timing_test_gcc.png" alt="no image" /></a></h6>NHG: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#gcc">g++</a><p>In the above figure, the names in the legends have the following meaning:</p>
<ol>
<li>
cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map-
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
 with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
<li>
gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map-
<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a>
 with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a>
</li>
<li>
n_hash_map_ncah-
<tt>std::tr1::unordered_map</tt> with <tt>cache_hash_code</tt> = <tt><b>false</b></tt></li>
<li>
cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map-
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
</ol>
</div><div style="width: 100%; height: 20px"></div></div>
</div>
</div>
</div>
</div>
<div id="NHM_res_div">
<div id="NHM_msvc">
<div id="NHM_hash_zlob_random_int_find_timing_test">
<div id="NHM_assoc">
<div id="NHM_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style="border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHM" id="NHM"><img src="hash_zlob_random_int_find_timing_test_msvc.png" alt="no image" /></a></h6>NHM: Skewed-distribution random int find timing test using <tt>find</tt> - <a href="assoc_performance_tests.html#msvc">msvc++</a><p>In the above figure, the names in the legends have the following meaning:</p>
<ol>
<li>
n_hash_map_ncah-
<tt>stdext::hash_map</tt></li>
<li>
cc_hash_mask_exp_nea_lc_1div8_1div1_nsth_map-
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mask_range_hashing.html"><tt>direct_mask_range_hashing</tt></a>
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
 with <tt>Size_Policy</tt> = <a href="hash_exponential_size_policy.html"><tt>hash_exponential_size_policy</tt></a>
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
<li>
gp_hash_mod_quadp_prime_nea_lc_1div8_1div2_nsth_map-
<a href="gp_hash_table.html"><tt>gp_hash_table</tt></a>
 with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
, <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/2</i>, and <tt>Probe_Fn</tt> = <a href="quadratic_probe_fn.html"><tt>quadratic_probe_fn</tt></a>
</li>
<li>
cc_hash_mod_prime_nea_lc_1div8_1div1_nsth_map-
<a href="cc_hash_table.html"><tt>cc_hash_table</tt></a>
with <tt>Comb_Hash_Fn</tt> = <a href="direct_mod_range_hashing.html"><tt>direct_mod_range_hashing</tt></a>
, and <tt>Resize_Policy</tt> = <a href="hash_standard_resize_policy.html"><tt>hash_standard_resize_policy</tt></a>
 with <tt>Size_Policy</tt> = <a href="hash_prime_size_policy.html"><tt>hash_prime_size_policy</tt></a>
, and <tt>Trigger_Policy</tt> = <a href="hash_load_check_resize_trigger.html"><tt>hash_load_check_resize_trigger</tt></a>
 with <i>&alpha;<sub>min</sub></i> = <i>1/8</i> and <i>&alpha;<sub>max</sub></i> = <i>1/1</i></li>
</ol>
</div><div style="width: 100%; height: 20px"></div></div>
</div>
</div>
</div>
</div>
<div id="NHL_res_div">
<div id="NHL_local">
<div id="NHL_hash_zlob_random_int_find_timing_test">
<div id="NHL_assoc">
<div id="NHL_Skewed-distribution_random_int_find_timing_test_using__tt_find_455tt_"><div style = "border-style: dotted; border-width: 1px; border-color: lightgray"><h6 class="c1"><a name="NHL" id= "NHL"><img src="hash_zlob_random_int_find_timing_test_local.png" alt="no image" /></a></h6>NHL: Skewed-distribution random int find timing test using <tt>find</tt> - <a href = "assoc_performance_tests.html#local">local</a></div><div style = "width: 100%; height: 20px"></div></div>
</div>
</div>
</div>
</div>
<h2><a name="observations" id="observations">Observations</a></h2>
<p>In this setting, the keys' distribution is so skewed that
    the unerlying hash-table type affects performance marginally.
    (This is in contrast with <a href="hash_text_find_find_timing_test.html">Hash-Based Text
    <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_find_find_timing_test.html">Hash-Based
    Random-Integer <tt>find</tt> Find Timing Test</a> , <a href="hash_random_int_subscript_find_timing_test.html">Hash-Based
    Random-Integer Subscript Find Timing Test</a> and <a href="hash_random_int_subscript_insert_timing_test.html">Hash-Based
    Random-Integer Subscript Insert Timing Test</a> .)</p>
<p>The range-hashing scheme affects performance dramatically. A
    mask-based range-hashing scheme effectively maps all values
    into the same bucket. Access degenerates into a search within
    an unordered linked-list. In Figures <a href="#NHG">NHG</a> and
    <a href="#NHM">NHM</a> , it should be noted that
    <tt>std::tr1::unordered_map</tt> and <tt>stdext::hash_map</tt>
    are hard-wired currently to mod-based and mask-based schemes,
    respectively.</p>
<p>When observing the settings of this test, it is apparent
    that the keys' distribution is far from natural. One might ask
    if the test is not contrived to show that, in some cases,
    mod-based range hashing does better than mask-based range
    hashing. This is, in fact just the case. We did not encounter a
    more natural case in which mod-based range hashing is better.
    In our opnion, real-life key distributions are handled better
    with an appropriate hash function and a mask-based
    range-hashing function. (<a href="http://gcc.gnu.org/viewcvs/*checkout*/trunk/libstdc%2B%2B-v3/testsuite/ext/pb_ds/example/hash_shift_mask.cc"><tt>shift_mask.cc</tt></a>
    shows an example of handling this a-priori known skewed
    distribution with a mask-based range-hashing function). If hash
    performance is bad, a <i>&Chi;<sup>2</sup></i> test can be used
    to check how to transform it into a more uniform
    distribution.</p>
<p>For this reason, <tt>pb_ds</tt>'s default range-hashing
    function is mask-based.</p>
<p><a href="assoc_performance_tests.html#hash_based_types">Observations::Hash-Based
    Container Types</a> summarizes some observations on hash-based
    containers; <a href="assoc_performance_tests.html#hash_based_policies">Observations::Hash-Based
    Containers' Policies</a> summarizes some observations on
    hash-based containers' policies.</p>
</div>
</body>
</html>