nsCOMPtr has never been so pretty

Jim Blandy announced his archer-mozilla pretty-printers for SpiderMonkey late last year. I’ve used them a few times while working on some JS proxy bugs, and I’ve found them to be invaluable. So invaluable, in fact, that I’ve written a bunch of pretty-printers for some pain points outside of js/. If this prospect excites you so much that you can’t be bothered to read the remaining examples, you can acquire the code from its hg repo. Please note that you’ll need a gdb newer than 7.2 (which probably means trunk at the time of writing), because the printers depend on the recently added dynamic_type feature.

What’s supported right now:

nsCOMPtr<nsIFoo>

If you print an nsCOMPtr, you’ll see one of two things:
[(nsIFoo*) 0x0]
or
[(nsConcreteFoo*) 0x12345678]

In addition, pretty-printers chain, so if there’s a pretty-printer for nsConcreteFoo*, you’ll see that printed instead of just the pointer value. nsRefPtr and nsAutoPtr are also supported, and maybe nsWeakPtr (untested).

nsCString/nsString

It prints the string inline, up to the limit specified by gdb. This probably only works for ASCII strings at the moment; the code is a bit of a mess.

nsTArray/InfallibleTArray

You’ll get output along the lines of [0x12345678, 0x12345678] at the moment. Like the smart pointers, each printed element will be pretty-printed if possible.

nsStandardURL*

You’ll see the spec printed. This works wonderfully when you’re looking at an object with nsCOMPtr<nsIURI> foo, where foo is actually pointing to an nsStandardURL object.

Tagged reference counts

You know in cycle-collected objects when the refcount shows you mTagged = 0x8? Yeah, not any more.

PRPackedBool

I found the existing display of mBoolVal = 1 '\001' to be silly.
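To give a sense of how little code a printer takes, here is a minimal sketch of a PRPackedBool printer written against gdb's Python API. It is an illustration rather than the code in the repo: the names PackedBoolPrinter and lookup_packed_bool are invented here, and it assumes gdb reports the typedef name PRPackedBool for the value's type.

```python
try:
    import gdb  # only importable when the script runs inside gdb
except ImportError:
    gdb = None


class PackedBoolPrinter:
    """Hypothetical printer: show a PRPackedBool as true/false."""

    def __init__(self, value):
        self.value = value

    def to_string(self):
        # gdb calls to_string() to get the text it displays for the value.
        return "true" if int(self.value) != 0 else "false"


def lookup_packed_bool(value):
    # gdb passes each value to the registered lookup functions and expects
    # a printer object back, or None if the function declines the value.
    if value.type.name == "PRPackedBool":
        return PackedBoolPrinter(value)
    return None


if gdb is not None:
    gdb.pretty_printers.append(lookup_packed_bool)
```

Loading a file like this with gdb's source command should be enough to try it out; the printers in the repo are organized differently.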

What’s coming up?

I’ve got a list of things that still need fixing/improving, and I’m open to suggestions/patches.

  • Auto arrays – they kinda-sorta work right now, but it looks gross.
  • COM/void/observer arrays – don’t work at all
  • Hashtables – don’t work at all right now
  • Stricter regexes for matching types – things like pointers to arrays are matched by the nsTArray printer right now
  • Auto strings – I don’t remember if they don’t work at all or semi-work
  • PRUnichar* – doesn’t work.
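As an illustration of the stricter-regex item in the list above, here is a rough sketch of the difference between a loose and an anchored type match. Both patterns are assumptions made up for this example, not the regexes currently in the repo.

```python
import re

# Loose (hypothetical): matches anywhere in the type name, so a pointer
# or reference to an array is picked up by the array printer too.
LOOSE = re.compile(r"nsTArray<")

# Stricter (hypothetical): the whole type name must be an array
# instantiation, optionally const, with nothing after the closing '>'.
STRICT = re.compile(r"^(const )?(nsTArray|InfallibleTArray)<.*>$")


def is_array_type(type_name):
    """Return True only for array values, not pointers or references."""
    return STRICT.match(type_name) is not None


for name in ["nsTArray<int>", "nsTArray<int> *", "const nsTArray<int> &"]:
    print(name, bool(LOOSE.search(name)), is_array_type(name))
```

The loose pattern accepts all three names, while the anchored one accepts only the first.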

and so on and so forth. It’s really not difficult to write more printers, and it improves the output of gdb immensely in my opinion. Give it a shot! The instructions for how to set this up can be found in the README file. You’ll need a version of gdb > 7.2 (probably trunk, at the time of writing), since my printers rely on gdb.Value.dynamic_type.

Self-serve tools: now more likely to work

If you’ve given my self-serve tools a try (in particular, cancel.py) and had it claim that it couldn’t authorize you, it’s time to give it another shot. Steven Fink, being the wonderful person he is, dove in and found some weirdness in how my usage of urllib interacted with the redirections performed by the self-serve API. The end result is that the tool is significantly more likely to work for you, and if it doesn’t, I am really interested in seeing the output of python cancel.py -d, which makes urllib significantly more verbose about what it’s sending and receiving. Good hunting!

Build smarter, not harder

I spent the past six weeks roaming around Europe with a netbook, and used that time productively to get some work done on Firefox. Part of that involved building on Windows for the first time, and experiencing the joy of pymake. However, I found the extra characters required to fire off incremental builds with pymake pushed me just past the pain point required to get me to write some sort of automation. With that in mind, I introduce smartmake, a tool to allow you to specify as little information as possible to build incrementally while still ending up with a working finished build.

Here’s how it works: I’ve encoded some basic dependencies into a python file (any changes to layout/build, netwerk/build, js/src, etc. require a rebuild of toolkit/library, any changes to layout/ or dom/ require rebuilding layout/build, etc). You pass a list of srcdir directories that have been changed to the script, and it prints out a list of build commands joined by &&, i.e.:

$ smartmake ipc/ipdl dom/src/geolocation
make -C objdir/ipc/ipdl && make -C objdir/dom/src/geolocation && make -C objdir/layout/build && make -C objdir/toolkit/library

It’s a pretty dumb tool at the moment, and there are certainly lots of edge cases that don’t actually work correctly. However, I found it useful enough in Europe that when I returned home to a different machine, I missed it. It’s hardcoded for my own setup right now, but I’ll try to make it more general if people are interested (i.e. the objdir and cmd variables could be read from a config file). Hit me up with any requests or if you have wonderful ideas for how to improve it.
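The idea can be sketched in a few lines. This is a simplified illustration, not the actual smartmake source: the RULES table contains only the dependencies mentioned above, and the names build_dirs and build_command are made up for the example.

```python
OBJDIR = "objdir"  # hardcoded, as in the first version of the tool
CMD = "make -C"

# (source directory, directory that must be rebuilt after it changes).
# Only the dependencies named in this post; the real list is longer.
RULES = [
    ("layout/build", "toolkit/library"),
    ("netwerk/build", "toolkit/library"),
    ("js/src", "toolkit/library"),
    ("layout", "layout/build"),
    ("dom", "layout/build"),
]


def build_dirs(changed):
    """Return the changed directories followed by everything they trigger."""
    ordered = []
    queue = [d.rstrip("/") for d in changed]
    while queue:
        current = queue.pop(0)
        if current in ordered:
            continue
        ordered.append(current)
        for prefix, target in RULES:
            inside = current == prefix or current.startswith(prefix + "/")
            if inside and target != current:
                queue.append(target)
    return ordered


def build_command(changed):
    """Join one build command per directory with &&."""
    return " && ".join(f"{CMD} {OBJDIR}/{d}" for d in build_dirs(changed))


print(build_command(["ipc/ipdl", "dom/src/geolocation"]))
```

With these rules the sketch prints the same command line as the example above. It simply appends triggered directories in the order they are discovered, which is one of the places where edge cases can produce a wrong build order.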

Update: I’ve genericized it a small amount. Now, you’ll need to create .hg/.smartmake like so:
[smartmake]
objdir: objdir/
cmd: make -C

before the updated smartmake will allow you to continue. Notice that this is per tree, not global. Furthermore, smartmake.py is designed to be used as a tool that outputs a command line – I have a shell alias that pipes its output to sh.
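For the curious, a file in that format can be read with Python's standard configparser module. The snippet below is a sketch under that assumption (the real script may parse it differently), and load_settings is a name invented for the example.

```python
import configparser

# Same contents as the .hg/.smartmake example above.
SAMPLE = """\
[smartmake]
objdir: objdir/
cmd: make -C
"""


def load_settings(text):
    """Return (objdir, cmd) from the [smartmake] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["smartmake"]
    return section["objdir"], section["cmd"]


objdir, cmd = load_settings(SAMPLE)
print(f"{cmd} {objdir}toolkit/library")  # make -C objdir/toolkit/library
```

To read the per-tree file itself, parser.read(".hg/.smartmake") would replace the read_string call.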

JS runnables: now with less boilerplate

Actually, this little trick has been possible for at least a year and a half since I fixed the enhancement request, but I don’t believe it’s common knowledge. When writing something like
someEventTarget.dispatch({ run: function() { ... } }), you can simply use someEventTarget.dispatch(function() { ... }) and skip the object goop. It looks cleaner to my eyes, so I thought I’d try to get the message out.

Cancelling builds from the console, now easier than ever!

The self-serve tools, specifically cancel.py, have received some important usability upgrades at the urging of jst and ehsan. Now, simply running

python cancel.py

will be enough to get you going – you’ll be prompted for your username, password, branch and hg revision. The builds displayed now also show their state (running, pending or complete), so it’s easier to find what you’re looking for. Let me know if there are more changes you’d like made!

How to identify expected and unexpected crashes in tinderbox logs

I’ve seen this come up in several bugs recently, and it’s time to disseminate some knowledge. Here is what an unexpected crash usually looks like:

TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_seek.html | application timed out after 330 seconds with no output
PROCESS-CRASH | /tests/content/media/test/test_seek.html | application crashed (minidump found)

You’re looking at a test harness timeout because of a crash. Simple, easy to recognize. Even shows the crashing test for you!

PROCESS-CRASH | Main app process exited normally | application crashed (minidump found)

Here’s a sly one. This is a crash, but the lack of a test name and the “Main app process exited normally” text actually mean that a subprocess crashed intentionally. We’ve got tests that run scripts that cause crashes in child processes so that we can test the recovery behaviour in the parent, but unfortunately we don’t display that information very well right now.

If you’re in doubt as to what kind of crash you’re seeing in the log, there’s another heuristic you can apply by looking at the crashing stack:

Crash reason: SIGSEGV
Crash address: 0x8

Thread 0 (crashed)
0 libxul.so!js::ctypes::ConvertToJS [typedefs.h:a538db9ab619 : 113 + 0x5]
rbx = 0x00000008 r12 = 0xa7833800 r13 = 0x00000000 r14 = 0x00000000
r15 = 0x00000000 rip = 0xb69142f5 rsp = 0xf25f7490 rbp = 0xa87a4690
Found by: given as instruction pointer in context
1 libxul.so!js::ctypes::PointerType::ContentsGetter [CTypes.cpp:a538db9ab619 : 3393 + 0x1b]
rbx = 0xa7833800 r12 = 0xa87a4690 r13 = 0xf25f74e8 r14 = 0xffffffff
r15 = 0xf25f7ae0 rip = 0xb69173cf rsp = 0xf25f74e0 rbp = 0xa877e750
Found by: call frame info

This is an intentional crash. We use jsctypes to dereference 0x8, an invalid address, and this is what it looks like every single time. If you don’t see this stack, you’re looking at a crash that should be filed.
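Both heuristics are mechanical enough to script. The following is a rough sketch that assumes the log lines and stack dumps look exactly like the samples above; the function names are invented for illustration and no existing tool is implied.

```python
import re

CRASH_LINE = re.compile(r"^PROCESS-CRASH \| (?P<source>.*?) \| (?P<message>.*)$")
INTENTIONAL_SOURCE = "Main app process exited normally"


def classify_crash_line(line):
    """Return 'expected', 'unexpected', or None for non-crash lines."""
    match = CRASH_LINE.match(line.strip())
    if match is None:
        return None
    # No test name, just the "exited normally" text: intentional crash.
    if match.group("source") == INTENTIONAL_SOURCE:
        return "expected"
    return "unexpected"


def is_intentional_stack(stack_text):
    """True if the stack matches the jsctypes dereference of 0x8."""
    address = re.search(r"^Crash address: 0x8$", stack_text, re.MULTILINE)
    # Frame 0 looks like: 0 libxul.so!js::ctypes::ConvertToJS [...]
    frame = re.search(r"^[ \t]*0[ \t]+\S+!(\S+)", stack_text, re.MULTILINE)
    return bool(address and frame
                and frame.group(1) == "js::ctypes::ConvertToJS")
```

Anything that classify_crash_line reports as unexpected, or whose stack fails is_intentional_stack, deserves a closer look before being dismissed.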

So, to summarize: not every crash is unexpected. Keep your wits about you; know your crash stacks.

Self-serve, now in bulk

Update: the tool is now easier to use and doesn’t require adding your password as an argument. See this post for more details.

I’m a big fan of the self-serve tool that RelEng provided for people with LDAP access. When I can see a try build going bad, I can cancel all the remaining builds and free up resources, or retrigger completed builds if I want to get extra results. Unfortunately, the server is fairly slow to respond and the UI to perform these actions is clumsy. Luckily, there’s a really simple API available to allow anyone with access to make use of these tools through more traditional (read: non-graphical) means. Allow me to introduce you to a new repo I set up today to make working with the self-serve API easier – self-serve tools. selfserve.py contains simple wrappers for every API point exposed, and some basic documentation of the values returned by most of the calls. cancel.py is an example of a really simple tool that can be built on top of the wrappers to allow for bulk cancellation. Here’s what a session looks like:

[jdm@Phaethon self-serve-tools]$ python cancel.py -u "jmatthews@mozilla.com" -p my5ecureP4ssword123 -r 306838f27b33 -b try
1: Linux x86-64 tryserver leak test build
2: Linux tryserver build
3: OS X 10.6.2 tryserver build
4: WINNT 5.2 tryserver build
5: Maemo 5 QT tryserver build
6: OS X 10.6.2 tryserver leak test build
7: Maemo 5 GTK tryserver build
8: Android R7 tryserver build
9: Linux tryserver leak test build
10: WINNT 5.2 tryserver leak test build
11: OS X 10.5.2 tryserver leak test build
12: Linux QT tryserver build
13: Linux x86-64 tryserver build
14: all
Builds to cancel: 1 3 5
Cancelling Linux x86-64 tryserver leak test build
Cancelling OS X 10.6.2 tryserver build
Cancelling Maemo 5 QT tryserver build
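The menu handling in a session like this one could look roughly as follows. This is a hypothetical sketch of the selection step only, not the code in cancel.py; parse_selection is a made-up name and the self-serve API calls are left out entirely.

```python
def parse_selection(answer, builds):
    """Map a reply such as '1 3 5' to build names.

    The menu is numbered from 1, and the entry after the last build
    stands for "all", as in the session above.
    """
    all_index = len(builds) + 1
    chosen = []
    for token in answer.split():
        index = int(token)
        if index == all_index:
            return list(builds)
        if not 1 <= index <= len(builds):
            raise ValueError(f"no such build: {index}")
        chosen.append(builds[index - 1])
    return chosen


builds = ["Linux tryserver build", "OS X 10.6.2 tryserver build",
          "WINNT 5.2 tryserver build"]
for name in parse_selection("1 3", builds):
    print("Cancelling", name)
```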

This is just the first cut, but I’m excited not to have to use the web interface any more. Please feel free to add further documentation, or even new tools! I’m excited to see what other people can build with this.

nsITimer anti-pattern


I’ve filed bug 640629 to address an intermittent source of orange: incorrect nsITimer creation. I first ran across it while working on making httpd.js collect garbage more frequently, a task which quickly turned into orange whack-a-mole as more and more problematic test constructs popped out of the nether. Mounir Lamouri (volkmar) recently fixed another instance of the nsITimer problem, so I thought I’d address it in public and do some education.

When you see a timer created from this contract ID and kept only in a local variable, you should be wary:

"@mozilla.org/timer;1"

There’s a common misconception that timers retain an extra reference that is released after they fire. This is false. If a timer is created and stored in a locally-scoped variable and the scope is exited, the timer is at risk of being garbage-collected before the timer fires. To combat this, store a reference to the timer elsewhere – a member of an object that outlives the current scope, a global variable, it doesn’t matter. Do your part – save a timer’s life today.

Knowledge++

Nine days ago, I made an off-hand remark in #content that I might be able to get the geolocation service working in Fennectrolysis by the end of the day if my plans worked out. I also remember referring to the process as “not a big deal.” Since that moment, I have put in a significant amount of work (at least several hours every day), and learned:

  • My estimating skills are severely underdeveloped
  • How to make use of the cycle collector
  • How weak references work
  • Best practices for XPCOM reference counting
  • There’s a confusing thing called nsIClassInfo which I should learn more about, but I know enough to force it to do my bidding for now
  • How non-modal prompts work
  • The meaning of obscure GCC linker errors like “undefined reference to vtable”
  • How to implement an XPCOM object in JavaScript
  • Implementing XPCOM objects in JavaScript frequently results in much more pleasant code than C++

Having said all that, yesterday I got the Fennec geolocation permission prompt to appear when triggered by a content page, and the proper callback was called when I allowed or canceled the request, so I’m confident that I can have a patch up for review by the end of the holiday weekend. Of course, given my track record, that means it might be up by the end of the week.

I’ve seen the future, brother: it is dynamic additions to the status bar that don’t block the main process.

You’re looking at a mind-bogglingly alpha Jetpack prototype running out of process. Yesterday was a black triangle moment for me, as I finally saw the culmination of 2.5 months of work to make the words “Gmail it” appear in the status bar.

In this implementation, when a Jetpack tries to do something that doesn’t really make sense in its own process (say, adding an element to the status bar), it proxies this operation to the chrome process and continues on its way. Theoretically, delegating the main work of running Jetpack scripts to another process lets the main chrome process focus on important things like staying responsive and not freezing.

There’s lots and lots more work from here (for example, clicking “Gmail it” does nothing for various reasons I need to explore), but this inauspicious screenshot demonstrates that the out of process future is alive and kicking!