4 This script reads in the profile-raw file generated by Glulxe profiling,
5 and lists the ten most costly functions. (In terms of how much total time
6 was spent inside each function. If a function calls other functions, the
7 time spent in them is not charged to the parent; that is, a function
8 which does nothing but call other functions will be considered uncostly.)
10 Optionally, this script can also read the debug output of the Inform 6
11 compiler (or the assembly output), and use that to figure out the
12 names of all the functions that were profiled.
14 You can also generate profiling output in the same form as dumbfrotz's
15 Z-machine profiling output. (If that happens to be what you want.) Use
16 the --dumbfrotz argument.
18 Using this script is currently a nuisance. The requirements:
20 - You must compile Glulxe with profiling (the VM_PROFILING compile-time
22 - (If you want function names) you should compile your Inform 6 source
23 using the -k switch. This generates a "gameinfo.dbg" file.
24 - Run Glulxe, using the "--profile profile-raw" option. Play some of
25 the game, and quit. This generates a data file called "profile-raw".
26 - Run this script, giving gameinfo.dbg and profile-raw as arguments.
27 (You can provide dispatch_dump.xml as an optional third argument.
28 This file gives the names of Glk functions; it is available from
29 https://github.com/erkyrath/glk-dev/tree/master/dispatch_dump .)
31 To sum up, in command-line form:
33 % inform -G -k game.inf
34 % glulxe --profile profile-raw game.ulx
35 % python profile-analyze.py profile-raw gameinfo.dbg dispatch_dump.xml
37 You can also use the assembly output of the Inform compiler, which you
38 get with the -a switch. Save the output and use it instead of the debug
41 % inform -G -a game.inf > game.asm
42 % glulxe --profile profile-raw game.ulx
43 % python profile-analyze.py profile-raw game.asm dispatch_dump.xml
47 The profiling code is not smart about VM operations that rearrange the
48 call stack. In fact, it's downright stupid. @restart, @restore,
49 @restoreundo, or @throw will kill the interpreter.
51 Inform's -k switch does not work correctly with game files larger than
54 Inform's -a switch does not display code for veneer functions, so if
55 you use that data, these will not be named; they will be listed as
56 "<???>". This is a particular nuisance because veneer functions are
57 often the most costly ones. (Therefore, you'll almost certainly want
60 If you leave off the "dispatch_dump.xml" argument, everything will
61 still work, but @glk function entries will be listed by number rather
64 You can explore the profiling data in more detail by running the script
67 % python -i profile-analyze.py profile-raw game.asm dispatch_dump.xml
69 After it runs, you'll be left at a Python prompt. The environment
70 will contain mappings called "functions" (mapping addresses to
71 function objects), and "function_names" (names to function objects).
74 <Function $3c 'Main__'>
75 >>> function_names['Main__']
76 <Function $3c 'Main__'>
77 >>> function_names['Main__'].dump()
79 at $00003c (line 0); called 1 times
80 0.000067 sec (1 ops) spent executing
81 6.273244 sec (117578 ops) including child calls
83 A Function object has lots of attributes:
85 addr=INT: The VM address of the function (in hex).
86 hexaddr=STRING: The VM address of the function in hex (as a string).
87 name=STRING: The function's name, or '<???>' if the function is
88 not known (veneer functions).
89 linenum=INT: The line number of the function from the source code,
90 or 0 if it is not derived from source (Main__, etc).
91 call_count=INT: The number of times the function was called.
92 accel_count=INT: The number of times the function was called with
94 total_time=FLOAT: The amount of time spent during all calls to the
95 function (in seconds, as a floating-point value).
96 total_ops=INT: The number of opcodes executed during all calls to
98 self_time=FLOAT: The amount of time spent during all calls to the
99 function, excluding time spent in subcalls (functions called *by* the
101 self_ops=INT: The number of opcodes executed during all calls to
102 the function, excluding time spent in subcalls.
103 max_depth=INT: The deepest this function has been nested on the
104 stack, during any call.
105 max_stack_use=INT: The greatest number of words on the stack, during
106 any call. (This is measured when the function returns, so it may
107 not capture the peak stack usage. If a function never returns, e.g.
108 Main__(), then this value is approximate.)
110 (The self_time is the "cost" used for the original listing.)
112 Note that if a function does not make any function calls, total_time
113 will be the same as self_time (and total_ops the same as self_ops).
115 Some of the function entries refer to special interpreter operations.
116 (These have high addresses, outside the range of normal game files.)
117 Functions with addresses in the 0xE0000000 range are the interpreter's
118 output opcodes: @streamchar, @streamunichar, @streamnum, @streamstr.
120 Functions with addresses in the 0xF0000000 range are @glk opcode calls.
121 The number in the lower bits specifies which Glk function was called.
122 You will always see a large self_time for function 0xF00000C0; this
123 represents all the time spent waiting for input in glk_select().
125 (Both the 0xE0000000 and 0xF0000000 entries represent time spent in the
126 Glk library, but they get there by different code paths.)
128 The function with the lowest address is the top-level Main__()
129 function generated by the compiler. Its total_time is the running time
130 of the entire program; its total_ops is the number of opcodes executed
131 by the entire program; its max_depth is zero.
137 from struct import unpack
139 dumb_frotz_mode = False
141 if ('--dumbfrotz' in sys.argv):
142 sys.argv.remove('--dumbfrotz')
143 dumb_frotz_mode = True
145 if (len(sys.argv) < 2):
146 print "Usage: profile-analyze.py [--dumbfrotz] profile-raw [ gameinfo.dbg | game.asm ] [ dispatch_dump.xml ]"
149 profile_raw = sys.argv[1]
150 if (not os.path.exists(profile_raw)):
151 print 'File not readable:', profile_raw
155 if (len(sys.argv) >= 3):
156 game_asm = sys.argv[2]
157 if (not os.path.exists(game_asm)):
158 print 'File not readable:', game_asm
162 if (len(sys.argv) >= 4):
163 dispatch_dump = sys.argv[3]
164 if (not os.path.exists(dispatch_dump)):
165 print 'File not readable:', dispatch_dump
168 special_functions = {
169 0xE0000001: 'streamchar',
170 0xE0000002: 'streamunichar',
171 0xE0000003: 'streamnum',
172 0xE0000004: 'streamstr',
181 def __init__(self, addr, hexaddr, attrs):
183 self.hexaddr = hexaddr
184 val = special_functions.get(addr)
185 if (addr >= 0xF0000000):
186 name = glk_functions.get(addr-0xF0000000)
188 name = hex(addr-0xF0000000)[2:]
189 name = '$' + name.replace('L', '')
190 self.name = '<@glk_' + name + '>'
196 self.name = '<@' + val + '>'
199 self.call_count = int(attrs['call_count'])
201 if (attrs.has_key('accel_count')):
202 self.accel_count = int(attrs['accel_count'])
203 self.total_ops = int(attrs['total_ops'])
204 self.total_time = float(attrs['total_time'])
205 self.self_ops = int(attrs['self_ops'])
206 self.self_time = float(attrs['self_time'])
207 if (attrs.has_key('max_depth')):
208 self.max_depth = int(attrs['max_depth'])
209 if (attrs.has_key('max_stack_use')):
210 self.max_stack_use = int(attrs['max_stack_use'])
213 return '<Function $' + self.hexaddr + ' ' + repr(self.name) + '>'
216 print '%s:' % (self.name,)
218 if (self.accel_count):
219 val = ' (%d accelerated)' % (self.accel_count,)
220 print ' at $%06x (line %d); called %d times%s' % (self.addr, self.linenum,self.call_count,val)
221 print ' %.6f sec (%d ops) spent executing' % (self.self_time, self.self_ops)
222 print ' %.6f sec (%d ops) including child calls' % (self.total_time, self.total_ops)
224 def dump_dumbfrotz_style(self):
227 pc1 = int(100*(float(self.self_ops)/float(ops_executed)))
228 pc2 = int(100*(float(self.total_ops)/float(ops_executed)))
230 percent1 = "%3d%%" % pc1
232 percent2 = "%3d%%" % pc2
233 print '%-36s %s %-10lu %s %-10lu %-10lu %-4d' % (self.name, percent1, self.self_ops, percent2, self.total_ops, self.call_count, self.max_depth)
235 class DispatchDumpHandler(xml.sax.handler.ContentHandler):
236 def startElement(self, name, attrs):
237 if (name == 'function'):
238 addr = int(attrs['id'])
239 glk_functions[addr] = str(attrs['name'])
241 class ProfileRawHandler(xml.sax.handler.ContentHandler):
242 def startElement(self, name, attrs):
245 if (name == 'profile'):
247 if (name == 'function'):
248 hexaddr = attrs.get('addr')
249 addr = int(hexaddr, 16)
250 func = Function(addr, hexaddr, attrs)
251 functions[addr] = func
264 if (lasttup and not ls):
265 (linenum, funcname, addr) = lasttup
266 sourcemap[addr] = (linenum, funcname)
269 if (len(ls) >= 4 and ls[2] == '[' and ls[1].startswith('+')):
272 addr = int(ls[1][1:], 16)
273 lasttup = (linenum, funcname, addr)
278 def __init__(self, funcnum):
279 self.funcnum = funcnum
284 self.endlinenum = None
288 return '<InformFunc $' + hex(self.addr)[2:] + ' ' + repr(self.name) + '>'
291 def __init__(self, fl):
294 self.function_names = {}
302 self.fake_actions = {}
307 val = unpack('>H', dat)[0]
309 raise ValueError('not an Inform debug file')
312 self.debugversion = unpack('>H', dat)[0]
314 self.informversion = unpack('>H', dat)[0]
317 1: self.read_file_rec,
318 2: self.read_class_rec,
319 3: self.read_object_rec,
320 4: self.read_global_rec,
321 5: self.read_attr_rec,
322 6: self.read_prop_rec,
323 7: self.read_fake_action_rec,
324 8: self.read_action_rec,
325 9: self.read_header_rec,
326 10: self.read_lineref_rec,
327 11: self.read_routine_rec,
328 12: self.read_array_rec,
329 13: self.read_map_rec,
330 14: self.read_routine_end_rec,
335 rectype = unpack('>B', dat)[0]
338 recfunc = rectable.get(rectype)
340 raise ValueError('unknown debug record type: %d' % (rectype,))
343 for func in self.functions.values():
344 self.function_names[func.name] = func
346 def read_file_rec(self, fl):
348 filenum = unpack('>B', dat)[0]
349 includename = self.read_string(fl)
350 realname = self.read_string(fl)
351 self.files[filenum] = ( includename, realname )
353 def read_class_rec(self, fl):
354 name = self.read_string(fl)
355 start = self.read_linenum(fl)
356 end = self.read_linenum(fl)
357 self.classes.append( (name, start, end) )
359 def read_object_rec(self, fl):
361 num = unpack('>H', dat)[0]
362 name = self.read_string(fl)
363 start = self.read_linenum(fl)
364 end = self.read_linenum(fl)
365 self.objects[num] = (name, start, end)
367 def read_global_rec(self, fl):
369 num = unpack('>B', dat)[0]
370 name = self.read_string(fl)
371 self.globals[num] = name
373 def read_array_rec(self, fl):
375 num = unpack('>H', dat)[0]
376 name = self.read_string(fl)
377 self.arrays[num] = name
379 def read_attr_rec(self, fl):
381 num = unpack('>H', dat)[0]
382 name = self.read_string(fl)
383 self.attributes[num] = name
385 def read_prop_rec(self, fl):
387 num = unpack('>H', dat)[0]
388 name = self.read_string(fl)
389 self.properties[num] = name
391 def read_action_rec(self, fl):
393 num = unpack('>H', dat)[0]
394 name = self.read_string(fl)
395 self.actions[num] = name
397 def read_fake_action_rec(self, fl):
399 num = unpack('>H', dat)[0]
400 name = self.read_string(fl)
401 self.fake_actions[num] = name
403 def read_routine_rec(self, fl):
405 funcnum = unpack('>H', dat)[0]
406 func = self.get_function(funcnum)
408 func.linenum = self.read_linenum(fl)
410 addr = unpack('>I', '\0'+dat)[0]
411 func.addr = int(addr)
412 func.name = self.read_string(fl)
415 val = self.read_string(fl)
421 def read_lineref_rec(self, fl):
423 funcnum = unpack('>H', dat)[0]
424 func = self.get_function(funcnum)
426 if (not func.seqpts):
430 count = unpack('>H', dat)[0]
431 for ix in range(count):
432 linenum = self.read_linenum(fl)
434 addr = unpack('>H', dat)[0]
435 func.seqpts.append( (linenum, addr) )
437 def read_routine_end_rec(self, fl):
439 funcnum = unpack('>H', dat)[0]
440 func = self.get_function(funcnum)
442 func.endlinenum = self.read_linenum(fl)
444 addr = unpack('>I', '\0'+dat)[0]
445 func.endaddr = int(addr)
447 def read_header_rec(self, fl):
451 def read_map_rec(self, fl):
453 name = self.read_string(fl)
457 addr = unpack('>I', '\0'+dat)[0]
459 self.map[name] = addr
461 def read_linenum(self, fl):
463 (funcnum, linenum, charnum) = unpack('>BHB', dat)
464 return (funcnum, linenum, charnum)
466 def read_string(self, fl):
474 def get_function(self, funcnum):
475 func = self.functions.get(funcnum)
477 func = InformFunc(funcnum)
478 self.functions[funcnum] = func
484 xml.sax.parse(dispatch_dump, DispatchDumpHandler())
486 xml.sax.parse(profile_raw, ProfileRawHandler())
488 source_start = min([ func.addr for func in functions.values()
489 if not func.special ])
491 if (not dumb_frotz_mode):
492 print 'Code segment begins at', hex(source_start)
493 print len(functions), 'called functions found in', profile_raw
496 fl = open(game_asm, 'rb')
501 elif (val == '\xde\xbf'):
502 fl = open(game_asm, 'rb')
503 debugfile = DebugFile(fl)
506 for func in debugfile.functions.values():
507 sourcemap[func.addr] = ( func.linenum[1], func.name)
509 fl = open(game_asm, 'rU')
516 for (addr, func) in functions.items():
519 tup = sourcemap.get(addr-source_start)
523 (linenum, funcname) = tup
525 func.linenum = linenum
527 if (not dumb_frotz_mode):
529 print len(badls), 'functions from', profile_raw, 'did not appear in asm (veneer functions)'
532 for func in functions.values():
533 function_names[func.name] = func
536 uncalled_funcs = [ funcname for (addr, (linenum, funcname)) in sourcemap.items() if (addr+source_start) not in functions ]
537 if (not dumb_frotz_mode):
538 print len(uncalled_funcs), 'functions found in', game_asm, 'were never called'
540 if (dumb_frotz_mode):
541 ls = functions.values()
542 ls.sort(lambda x1, x2: cmp(x2.total_ops, x1.total_ops))
545 max_stack_use = max([func.max_stack_use for func in ls])
547 if (func.total_ops > ops_executed):
548 ops_executed = func.total_ops
549 routine_calls = routine_calls + func.call_count
550 print 'Total opcodes: %lu' % ops_executed
551 print 'Total routine calls: %lu' % routine_calls
552 print 'Max. stack usage: %li' % max_stack_use
554 print '%-35s %-10s %-10s %-10s %-4s' % ('Routine', 'Ops', 'Ops(+Subs)', 'Calls', 'Nest')
556 func.dump_dumbfrotz_style()
558 print 'Functions that consumed the most time (excluding children):'
559 ls = functions.values()
560 ls.sort(lambda x1, x2: cmp(x2.self_time, x1.self_time))