Automatically creating MySQL FULLTEXT indexes with django syncdb

Solr's great, and I depend upon it for a lot, but sometimes it's just an extra moving piece that's not worth the deployment overhead. I'm working on a project where a simple MySQL FULLTEXT index will probably suffice instead, since we're already using MySQL.

The trick, though, is that Django doesn't create FULLTEXT indexes for you. Which means you have to drop down into the mysql shell and create your index by hand, or script it and remember to run the script, etc., etc. Or you could write a custom command, but you still have to remember to run that separately. But who remembers anything anymore?

I wanted this to be automatic, so that with any syncdb, it would ensure that the fulltext index is present without any extra step. To do this, the best option I've come up with is to use django's post-syncdb signal. Unfortunately that doc is in a state of flux, so I had to poke around at it all to get this straight, and I'm not sure I have it right yet, but I have it working, at least. Here's the script, which lives in the "management" module of the app's directory next to the with the model I want the fulltext index on.

This is using django's svn trunk, updated today.

  1. from django.db import connection
  2. from django.db.models import signals
  4. from myproj.myapp import models as myapp_models
  7. def create_fulltext_indexes(**kwargs):
  8. """
  9. Check whether myapp_searchindex has a fulltext index,
  10. and create it if it doesn't yet exist.
  11. """
  12. cursor = connection.cursor()
  13. try:
  14. cursor.execute("""
  15. SELECT * FROM information_schema.statistics
  16. WHERE table_name='myapp_searchindex'
  17. AND index_name='str_value_fulltext'
  18. """)
  19. rows = cursor.fetchall()
  20. if len(rows) == 0:
  21. print 'Creating fulltext index on myapp_searchindex.str_value'
  22. cursor.execute("""
  23. CREATE FULLTEXT INDEX str_value_fulltext
  24. ON myapp_searchindex (str_value)
  25. """)
  26. except:
  27. import traceback
  28. print traceback.print_exc()
  31. signals.post_syncdb.connect(create_fulltext_indexes, sender=myapp_models)

The key thing here is that the post_syncdb signal seems to get sent to this function more than once. I read some comment threads on this and used that advice to get the number of signal calls down from several to two, but I didn't want the index to get built and rebuilt over and over, so this checks MySQL 5.0's information_schema database to see if the index already exists or not before building it.

It seems likely that there's a better way to do this, but I couldn't find an easier solution that didn't require any kind of separate step. If you know one, please let me know! If not, hopefully this will help somebody else.

Update (8/15): This is the same strategy this code uses to swap out MySQL storage engines.


good work. cool.

good work. cool.