doc/README.wmem



$Id$

1. Introduction

NB: Wmem still does not provide all of the functionality of emem
    (see README.malloc), although it should provide most of it. New code
    may still need to use emem for the time being.

The 'emem' memory manager (described in README.malloc) has been a part of
Wireshark since 2005 and has served us well, but is starting to show its age.
The framework has become increasingly difficult to maintain, and limitations
in the API have blocked progress on other long-term goals such as
multi-threading and opening multiple files at once.

The 'wmem' memory manager is an attempt to write a new memory management
framework to replace emem. It provides a significantly updated API, a more
modular design, and it isn't all jammed into one 2500-line file.

Wmem was originally conceived in this email to the wireshark-dev mailing list:
https://www.wireshark.org/lists/wireshark-dev/201210/msg00178.html

The wmem code can now be found in epan/wmem/ in the Wireshark source tree.

2. Usage for Consumers

If you're writing a dissector, or other "userspace" code, then using wmem
should be very similar to using emem. All you need to do is include the header
(epan/wmem/wmem.h) and get a handle to a memory pool (if you want to *create*
a memory pool, see the section "3. Usage for Producers" below).

A memory pool is an opaque pointer to an object of type wmem_allocator_t, and
it is the very first parameter passed to almost every call you make to wmem.
Other than that parameter (and the fact that functions are prefixed wmem_
instead of ep_ or se_) usage is exactly like that of emem. For example:

    wmem_alloc(myPool, 20);

allocates 20 bytes in the pool pointed to by myPool.

2.1 Available Pools

2.1.1 (Sort Of) Global Pools

Dissectors that include the wmem header file will have three pools available
to them automatically: wmem_packet_scope(), wmem_file_scope() and
wmem_epan_scope().

The packet pool is scoped to the dissection of each packet, replacing
emem's ep_ allocators. The file pool is scoped to the dissection of each file,
replacing emem's se_ allocators. For example:

    ep_malloc(32);
    se_malloc(sizeof(guint));

could be replaced with

    wmem_alloc(wmem_packet_scope(), 32);
    wmem_alloc(wmem_file_scope(),   sizeof(guint));

NB: Using these pools outside of the appropriate scope (e.g. using the packet
    pool when there isn't a packet being dissected) will trigger an assertion
    failure.
    See the comment in epan/wmem/wmem_scopes.c for details.

The epan pool is scoped to the library's lifetime - memory allocated in it is
not freed until epan_cleanup() is called, which is typically at the end of the
program. 

2.1.2 Pinfo Pool

Certain places (such as AT_STRINGZ address allocations) need their memory to
stay around a little longer than the usual packet scope - basically until the
next packet is dissected. This is effectively the scope of Wireshark's pinfo
structure, so the pinfo struct has a 'pool' member which is a wmem pool scoped
to the lifetime of the pinfo struct.

2.2 Core API

 - wmem_alloc
 - wmem_alloc0
 - wmem_new
 - wmem_new0

2.3 String Utilities

 - wmem_strdup
 - wmem_strndup
 - wmem_strdup_printf
 - wmem_strdup_vprintf

2.4 Stack

 - wmem_stack_new
 - wmem_stack_push
 - wmem_stack_pop
 - wmem_stack_peek
 - wmem_stack_count

2.5 Singly-Linked List

 - wmem_slist_new
 - wmem_slist_prepend
 - wmem_slist_remove
 - wmem_slist_front
 - wmem_slist_frame_next
 - wmem_slist_frame_data
 - wmem_slist_count

2.6 Slab

 - wmem_slab_new
 - wmem_slab_alloc
 - wmem_slab_free

2.7 String-Buffers

 - wmem_strbuf_new
 - wmem_strbuf_sized_new
 - wmem_strbuf_append
 - wmem_strbuf_append_printf
 - wmem_strbuf_get_str
 - wmem_strbuf_get_len

3. Usage for Producers

NB: If you're just writing a dissector, you probably don't need to read
    this section.

One of the problems with the old emem framework was that there were basically
two allocator backends (glib and mmap) that were all mixed together in a mess
of if statements, environment variables and #ifdefs. In wmem the different
allocator backends are cleanly separated out, and it's up to the owner of the
pool to pick one.

3.1 Available Allocator Back-Ends

Each available allocator type has a corresponding entry in the
wmem_allocator_type_t enumeration defined in wmem_core.h.

The currently available allocators are:
 - WMEM_ALLOCATOR_SIMPLE (wmem_allocator_simple.*)
        A trivial allocator that g_allocs requested memory and tracks
        allocations via a GHashTable.
 - WMEM_ALLOCATOR_BLOCK (wmem_allocator_block.*)
        A block allocator that grabs large chunks of memory at a time
        (8 MB currently) and serves allocations out of those chunks.
 - WMEM_ALLOCATOR_STRICT (wmem_allocator_strict.*)
        An allocator that does its best to find invalid memory usage via
        things like canaries and scrubbing freed memory. Valgrind is the
        better choice on platforms that support it.
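
As a rough illustration of the block approach described above (not the actual
wmem_allocator_block code), the sketch below serves allocations out of a single
chunk by advancing an offset. All names (toy_block_t, toy_block_alloc, ...) and
the tiny chunk size are hypothetical, and unlike a real block allocator it
never grabs a second chunk when the first one is full:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes, for illustration only. */
#define TOY_CHUNK_SIZE 1024
#define TOY_ALIGN      8

typedef struct {
    unsigned char *chunk; /* one large region obtained up front */
    size_t         used;  /* bytes of the chunk already handed out */
} toy_block_t;

/* Obtain the chunk once. Returns 0 on success, -1 on failure. */
int toy_block_init(toy_block_t *b)
{
    b->chunk = (unsigned char *)malloc(TOY_CHUNK_SIZE);
    b->used  = 0;
    return (b->chunk != NULL) ? 0 : -1;
}

/* Serve a request from the chunk, or return NULL if it does not fit. */
void *toy_block_alloc(toy_block_t *b, size_t size)
{
    size_t rounded;
    void  *p;

    if (size == 0 || size > TOY_CHUNK_SIZE) {
        return NULL;
    }
    /* Round up so that the next allocation stays aligned. */
    rounded = (size + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1);
    if (rounded > TOY_CHUNK_SIZE - b->used) {
        return NULL;
    }
    p = b->chunk + b->used;
    b->used += rounded;
    return p;
}

/* Freeing everything only resets the offset; the chunk is kept for reuse. */
void toy_block_free_all(toy_block_t *b)
{
    b->used = 0;
}

/* Return the chunk itself to the system. */
void toy_block_destroy(toy_block_t *b)
{
    free(b->chunk);
    b->chunk = NULL;
    b->used  = 0;
}
```

In this sketch individual allocations cost a few arithmetic operations, and
freeing the whole pool is a single assignment, which is the kind of saving
that makes reusing a pool cheaper than destroying and recreating it.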

3.2 Creating a Pool

To create a pool, include the regular wmem header and call the
wmem_allocator_new() function with the appropriate type value.
For example:

    #include "wmem/wmem.h"

    wmem_allocator_t *myPool;
    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);

From here on in, you don't need to remember which type of allocator you used
(although allocator authors are welcome to expose additional allocator-specific
helper functions in their headers). The "myPool" variable can be passed around
and used as normal in allocation requests as described in section 2 of this
document.

Note that the type of pool you request won't always be the type you get - the
WIRESHARK_DEBUG_WMEM_OVERRIDE environment variable (if set) can be used to
force all calls to wmem_allocator_new() to return a specific type for debugging
purposes. It will always be safe to call allocator-specific helpers, though:
they will simply be no-ops if the type doesn't match.
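
The override logic could look roughly like the sketch below. This is not the
wmem implementation: the toy_ names are hypothetical, and the accepted values
("simple", "block", "strict") are an assumption made for illustration, since
this document does not list them.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for wmem_allocator_type_t. */
typedef enum {
    TOY_ALLOCATOR_SIMPLE,
    TOY_ALLOCATOR_BLOCK,
    TOY_ALLOCATOR_STRICT
} toy_allocator_type_t;

/* Decide which type to create, given the requested type and the value of
 * the override variable (NULL if the variable is not set). The value
 * strings are assumed, not taken from wmem. */
toy_allocator_type_t toy_resolve_type(toy_allocator_type_t requested,
                                      const char *override_value)
{
    if (override_value == NULL) {
        return requested;
    }
    if (strcmp(override_value, "simple") == 0) {
        return TOY_ALLOCATOR_SIMPLE;
    }
    if (strcmp(override_value, "block") == 0) {
        return TOY_ALLOCATOR_BLOCK;
    }
    if (strcmp(override_value, "strict") == 0) {
        return TOY_ALLOCATOR_STRICT;
    }
    /* Unrecognized value: fall back to what the caller asked for. */
    return requested;
}

/* Convenience wrapper that reads the environment variable itself. */
toy_allocator_type_t toy_resolve_type_from_env(toy_allocator_type_t requested)
{
    return toy_resolve_type(requested,
                            getenv("WIRESHARK_DEBUG_WMEM_OVERRIDE"));
}
```

Keeping the decision in one place means that callers never need to know
whether an override is active.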

3.3 Destroying a Pool

Regardless of which allocator you used to create a pool, it can be destroyed
with a call to the function wmem_destroy_allocator(). For example:

    #include "wmem/wmem.h"

    wmem_allocator_t *myPool;

    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);

    /* Allocate some memory in myPool ... */

    wmem_destroy_allocator(myPool);

Destroying a pool will free all the memory allocated in it.

3.4 Reusing a Pool

It is possible to free all the memory in a pool without destroying it,
allowing it to be reused later. Depending on the type of allocator, doing this
(by calling wmem_free_all()) can be significantly cheaper than fully destroying
and recreating the pool. This method is therefore recommended, especially when
the pool would otherwise be scoped to a single iteration of a loop. For example:

    #include "wmem/wmem.h"

    wmem_allocator_t *myPool;

    myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
    for (...) {

        /* Allocate some memory in myPool ... */

        /* Free the memory, faster than destroying and recreating
           the pool each time through the loop. */
        wmem_free_all(myPool);
    }
    wmem_destroy_allocator(myPool);

4. Internal Design

Despite being written in Wireshark's standard C90, wmem follows a fairly
object-oriented design pattern. Although efficiency is always a concern, the
primary goals in writing wmem were maintainability and preventing memory
leaks.

4.1 struct _wmem_allocator_t

The heart of wmem is the _wmem_allocator_t structure defined in the
wmem_allocator.h header file. This structure uses C function pointers to
implement a common object-oriented design pattern known as an interface (also
known as an abstract class to those who are more familiar with C++).

Different allocator implementations can provide exactly the same interface by
assigning their own functions to the members of an instance of the structure.
The structure has eight members in three groups.

4.1.1 Implementation Details

 - private_data
 - type

The private_data pointer is a void pointer that the allocator implementation can
use to store whatever internal structures it needs. This pointer is passed to
almost all of the other functions that the allocator implementation must
define.

The type field is an enumeration of type wmem_allocator_type_t (see
section 3.1). Its value is set by the wmem_allocator_new() function, not
by the implementation-specific constructor. This field should be considered
read-only by the allocator implementation.

4.1.2 Consumer Functions

 - alloc()
 - realloc()
 - free()

These function pointers should be set to functions with semantics obviously
similar to their standard-library namesakes. Each one takes an extra parameter
that is a copy of the allocator's private_data pointer.

Note that realloc() and free() are not expected to be called directly by user
code in most cases - they are primarily optimisations for use by data
structures that wmem might want to implement (it's hard, for example, to
implement a dynamically sized array without some form of realloc).

Also note that allocators do not have to handle NULL pointers or 0-length
requests in any way - those checks are done in an allocator-agnostic way
higher up in wmem. Allocator authors can assume that all incoming pointers
(to realloc and free) are non-NULL, and that all incoming lengths (to alloc
and realloc) are non-0.
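
A minimal sketch of such an allocator-agnostic layer follows. It is not the
wmem source: the toy_ names and the _fn member names are hypothetical, and the
exact results chosen for the NULL and 0-length cases are assumptions. The
counting backend exists only to show that the backend is never reached in
those cases.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator interface (consumer functions only). */
typedef struct {
    void  *private_data;
    void *(*alloc_fn)(void *private_data, size_t size);
    void *(*realloc_fn)(void *private_data, void *ptr, size_t size);
    void  (*free_fn)(void *private_data, void *ptr);
} toy_allocator_t;

/* Generic layer: filters out 0-length requests and NULL pointers so that
 * backends never see them. */
void *toy_alloc(toy_allocator_t *a, size_t size)
{
    if (size == 0) {
        return NULL;
    }
    return a->alloc_fn(a->private_data, size);
}

void toy_free(toy_allocator_t *a, void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    a->free_fn(a->private_data, ptr);
}

void *toy_realloc(toy_allocator_t *a, void *ptr, size_t size)
{
    if (ptr == NULL) {
        return toy_alloc(a, size);
    }
    if (size == 0) {
        toy_free(a, ptr);
        return NULL;
    }
    return a->realloc_fn(a->private_data, ptr, size);
}

/* Toy backend: forwards to the C library and counts how often it is called.
 * Its private_data is a pointer to an int counter. */
static void *counting_alloc(void *private_data, size_t size)
{
    ++*(int *)private_data;
    return malloc(size);
}

static void *counting_realloc(void *private_data, void *ptr, size_t size)
{
    ++*(int *)private_data;
    return realloc(ptr, size);
}

static void counting_free(void *private_data, void *ptr)
{
    ++*(int *)private_data;
    free(ptr);
}

void toy_counting_allocator_init(toy_allocator_t *a, int *calls)
{
    a->private_data = calls;
    a->alloc_fn     = counting_alloc;
    a->realloc_fn   = counting_realloc;
    a->free_fn      = counting_free;
}
```

With the checks centralized like this, each backend can stay focused on its
own bookkeeping.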

4.1.3 Producer/Manager Functions

 - free_all()
 - gc()
 - destroy()

The free_all() function takes the private_data pointer and should free all the
memory currently allocated in the pool. Note that this is not necessarily
exactly the same as calling free() on all the allocated blocks - free_all() is
allowed to do additional cleanup or to make use of optimizations not available
when freeing one block at a time.

The gc() function takes the private_data pointer and should do whatever it can
to reduce excess memory usage in the dissector by returning unused blocks to
the OS, optimizing internal data structures, etc.

The destroy() function does NOT take the private_data pointer - it instead takes
a pointer to the allocator structure as a whole, since that structure may also
need freeing. This function can assume that free_all() has been called
immediately before it (though it can make no assumptions about whether or not
gc() has ever been called).
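
The interface pattern described in this section can be sketched as follows.
This is a simplified, hypothetical model rather than the real
_wmem_allocator_t: the toy_ and _fn names are made up, the type field and the
realloc/free members are omitted for brevity, and the backend simply tracks
malloc'd blocks in a linked list.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct toy_allocator toy_allocator_t;

/* Hypothetical interface: private data plus function pointers. */
struct toy_allocator {
    void  *private_data;
    void *(*alloc_fn)(void *private_data, size_t size);
    void  (*free_all_fn)(void *private_data);
    void  (*gc_fn)(void *private_data);
    void  (*destroy_fn)(toy_allocator_t *allocator);
};

/* Private state of one toy implementation: a list of live blocks. */
typedef struct toy_node {
    void            *data;
    struct toy_node *next;
} toy_node_t;

typedef struct {
    toy_node_t *head;
    size_t      live;
} toy_list_state_t;

static void *list_alloc(void *private_data, size_t size)
{
    toy_list_state_t *st = (toy_list_state_t *)private_data;
    toy_node_t       *n  = (toy_node_t *)malloc(sizeof *n);

    if (n == NULL) {
        return NULL;
    }
    n->data = malloc(size);
    if (n->data == NULL) {
        free(n);
        return NULL;
    }
    n->next  = st->head;
    st->head = n;
    st->live++;
    return n->data;
}

static void list_free_all(void *private_data)
{
    toy_list_state_t *st = (toy_list_state_t *)private_data;

    while (st->head != NULL) {
        toy_node_t *n = st->head;
        st->head = n->next;
        free(n->data);
        free(n);
    }
    st->live = 0;
}

/* Nothing to compact in this backend. */
static void list_gc(void *private_data)
{
    (void)private_data;
}

/* Takes the whole allocator, since the structure itself needs freeing.
 * Assumes free_all has already run. */
static void list_destroy(toy_allocator_t *allocator)
{
    free(allocator->private_data);
    free(allocator);
}

/* Implementation-specific constructor: fills in the interface. */
toy_allocator_t *toy_list_allocator_new(void)
{
    toy_allocator_t  *a  = (toy_allocator_t *)malloc(sizeof *a);
    toy_list_state_t *st = (toy_list_state_t *)malloc(sizeof *st);

    if (a == NULL || st == NULL) {
        free(a);
        free(st);
        return NULL;
    }
    st->head = NULL;
    st->live = 0;
    a->private_data = st;
    a->alloc_fn     = list_alloc;
    a->free_all_fn  = list_free_all;
    a->gc_fn        = list_gc;
    a->destroy_fn   = list_destroy;
    return a;
}

/* Generic destroy: calls free_all immediately before destroy. */
void toy_allocator_destroy(toy_allocator_t *a)
{
    a->free_all_fn(a->private_data);
    a->destroy_fn(a);
}

/* Test helper; only valid for allocators made by toy_list_allocator_new(). */
size_t toy_list_live_count(const toy_allocator_t *a)
{
    return ((const toy_list_state_t *)a->private_data)->live;
}
```

Code that holds a toy_allocator_t pointer only ever goes through the function
pointers, so a different backend can be swapped in without touching callers.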

4.2 Pool-Agnostic API

One of the issues with emem was that the API (including the public data
structures) required wrapper functions for each scope implemented. Even
if there was a stack implementation in emem, it wasn't necessarily available
for use with file-scope memory unless someone took the time to write se_stack_
wrapper functions for the interface.

In wmem, all public APIs take the pool as the first argument, so that they can
be written once and used with any available memory pool. Data structures like
wmem's stack implementation only take the pool when created - the provided
pointer is stored internally with the data structure, and subsequent calls
(like push and pop) will take the stack itself instead of the pool.
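
To illustrate the idea, here is a minimal sketch of a stack that receives its
pool only at creation time. It is not wmem's stack: toy_pool_t is a
deliberately tiny stand-in for a memory pool, and every name in the block is
hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* A tiny stand-in for a memory pool: one chunk, handed out front to back. */
typedef struct {
    unsigned char *chunk;
    size_t         size;
    size_t         used;
} toy_pool_t;

int toy_pool_init(toy_pool_t *pool, size_t size)
{
    pool->chunk = (unsigned char *)malloc(size);
    pool->size  = size;
    pool->used  = 0;
    return (pool->chunk != NULL) ? 0 : -1;
}

void *toy_pool_alloc(toy_pool_t *pool, size_t size)
{
    size_t rounded;
    void  *p;

    if (size == 0 || size > pool->size) {
        return NULL;
    }
    /* Round to 16 bytes, assumed to be enough alignment for the
     * structures allocated below. */
    rounded = (size + 15) & ~(size_t)15;
    if (rounded > pool->size - pool->used) {
        return NULL;
    }
    p = pool->chunk + pool->used;
    pool->used += rounded;
    return p;
}

void toy_pool_destroy(toy_pool_t *pool)
{
    free(pool->chunk);
    pool->chunk = NULL;
    pool->size  = 0;
    pool->used  = 0;
}

typedef struct toy_frame {
    void             *data;
    struct toy_frame *below;
} toy_frame_t;

typedef struct {
    toy_pool_t  *pool;  /* remembered at creation time */
    toy_frame_t *top;
    unsigned int count;
} toy_stack_t;

/* The only stack call that takes the pool. */
toy_stack_t *toy_stack_new(toy_pool_t *pool)
{
    toy_stack_t *s = (toy_stack_t *)toy_pool_alloc(pool, sizeof *s);

    if (s == NULL) {
        return NULL;
    }
    s->pool  = pool;
    s->top   = NULL;
    s->count = 0;
    return s;
}

/* Returns 0 on success, -1 if the stored pool is exhausted. */
int toy_stack_push(toy_stack_t *s, void *data)
{
    toy_frame_t *f = (toy_frame_t *)toy_pool_alloc(s->pool, sizeof *f);

    if (f == NULL) {
        return -1;
    }
    f->data  = data;
    f->below = s->top;
    s->top   = f;
    s->count++;
    return 0;
}

/* Returns NULL on an empty stack. Frame memory is reclaimed with the pool. */
void *toy_stack_pop(toy_stack_t *s)
{
    toy_frame_t *f = s->top;

    if (f == NULL) {
        return NULL;
    }
    s->top = f->below;
    s->count--;
    return f->data;
}
```

The same stack code works unchanged with any number of pools; only the
argument to toy_stack_new() differs.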

5. TODO List

The following is a list of things that wmem provides but that are incomplete
(i.e. missing common operations):

 - string buffers
 - singly-linked list

The following is an incomplete list of things that emem provides but wmem has
not yet implemented:

 - red-black tree
 - tvb_memdup

The following is a list of things that emem doesn't provide but that would be
nice to have in wmem:

 - radix tree
 - dynamic array
 - hash table
 - realloc

/*
 * Editor modelines  -  http://www.wireshark.org/tools/modelines.html
 *
 * Local variables:
 * c-basic-offset: 4
 * tab-width: 8
 * indent-tabs-mode: nil
 * End:
 *
 * vi: set shiftwidth=4 tabstop=8 expandtab:
 * :indentSize=4:tabSize=8:noTabs=true:
 */