I'm not going to re-hash their posts, so you should go read them now, if you haven't done so before.
But who actually does either of those things? Well, Google Code search reports approximately 54,300 results for applications using addslashes() and approximately 100 applications which have the words "SET CHARACTER SET" in their code.
Not particularly many of the latter, but the very first result is the rather popular phpMyAdmin project, so its not a completely unused query.
Anyway, since I hadn't seen any research on which character sets were vulnerable (and which characters to use), I wrote a small fuzzer to test all the character sets which MySQL supports (other than UCS-2), for several different encoding attacks, though only the ones described by Chris and Ilia yielded any results. Here are the character sets vulnerable, and the first character of the multi-byte sequence where \ is the second character:
- big5, [A1-F9]
- sjis, [81-9F], [E0-FC]
- gbk, [81-FE]
- cp932, [81-9F], [E0-FC]
I didn't successfully test ucs2 because ucs2 is a fixed width 2 byte character encoding, which will not execute queries passed in standard ascii, and so it would be impossible to get a webapp working if you set your connections to ucs2 but didn't convert all your queries, etc, so a configuration issue like that would be instantly noticed, and if you were using an encoding unaware function, then it would definitely be vulnerable, since all byte sequences are two bytes.
Anyway, if anyone is interested, I uploaded the fuzzer here: http://mihd.net/pwe0f9. As you will notice, I ripped the code which Ilia used to illustrate the vulnerability for GBK, so the two pieces look very similar. Its also not very well written, but it worked and got the results for me.
Notes: Part 2 of the fuzzer is trying to see if its possible to have a double or single quote as the second character in a multi-byte sequence, and Part 3 is trying to see if its possible to use a quote as the first character.
Also, this is obviously MySQL specific, the reason for this is because (as far as I could find out) MySQL is the only RDBMS which allows you to set the connection encoding through a query, all the other require configuration changes, and while addslashes() issues are applicable to all RDBMS's, most applications these days use mysql_real_escape_string().