Tuesday, April 26, 2011

Whoosh Python Search Notes

Basic Whoosh Search Walkthrough


# Import required modules
import os
from whoosh.index import create_in
from whoosh.fields import *

# Create a "Schema" object to store text to search
# You can create custom search fields such as "title", "domain", "path"
# and "content". 
schema = Schema(title=TEXT(stored=True), domain=ID(stored=True),
                              path=ID(stored=True), content=TEXT)

# Check to see if a search index directory exists; if not, create one
if not os.path.exists("search_test"):
    os.mkdir("search_test")

# Create a search index in the search index directory
ix = create_in("search_test", schema, indexname="testindex")
writer = ix.writer()

# Index some documents
writer.add_document(title=u"Document 1", domain=u"www.mysearch.com",
                         path=u"/mywebsite", content=u'Article')
writer.add_document(title=u"Document 2", domain=u"www.mysearch.com",
                         path=u"/mywebsite/2/article/1", content=u'Document a')
writer.commit()

# Running search queries
from whoosh.qparser import QueryParser
import whoosh.index as index
# Check to see if the given search index exists
if index.exists_in("search_test", indexname="testindex"):
    # open the search index
    ix = index.open_dir("search_test", indexname="testindex")
    # generate a search query based on some string.
    # Here it creates a query looking for "Document" in the "content"
    # search index field. 
    query = QueryParser("content", ix.schema).parse(u'*Document*')
    with ix.searcher() as searcher:  
        # do the actual search
        results = searcher.search(query)
        results

# Running search queries: match against the "domain" field
from whoosh.qparser import QueryParser
import whoosh.index as index
if index.exists_in("search_test", indexname="testindex"):
    ix = index.open_dir("search_test", indexname="testindex")
    # use the domain value that was indexed above
    query = QueryParser("domain", ix.schema).parse(u'www.mysearch.com')
    with ix.searcher() as searcher:
        results = searcher.search(query)
        results

# Running search queries: search several fields at once
from whoosh.qparser import QueryParser, MultifieldParser
import whoosh.index as index
from whoosh.query import *
if index.exists_in("search_test", indexname="testindex"):
    # the index was created with indexname="testindex", so open it by that name
    ix = index.open_dir("search_test", indexname="testindex")
    query = MultifieldParser(["domain","title","content"], schema=ix.schema).parse(u"Document OR Article")
    with ix.searcher() as searcher:
        results = searcher.search(query)
        results

Different way to alter a Postgres table

Alter Postgres Tables via metadata 
(Thanks Louis R.) 

select atttypmod from pg_attribute where attrelid = 'TABLE1'::regclass and attname = 'title';
-- then set the new length (for varchar(n), atttypmod is n + 4, so this gives varchar(200))
update pg_attribute set atttypmod = 200+4 where attrelid = 'TABLE1'::regclass and attname = 'title';

http://sniptools.com/databases/resize-a-column-in-a-postgresql-table-without-changing-data
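
The 200+4 in the update above follows from how PostgreSQL records a varchar(n) length: atttypmod holds n plus a 4-byte header, and -1 means no limit. A minimal sketch of that arithmetic, where the helper names are hypothetical and exist only for illustration:

```python
# Hypothetical helpers illustrating the varchar atttypmod arithmetic.
VARHDRSZ = 4  # header size PostgreSQL adds to the declared varchar length


def varchar_to_atttypmod(length):
    """Return the atttypmod value that corresponds to varchar(length)."""
    return length + VARHDRSZ


def atttypmod_to_varchar(atttypmod):
    """Return the declared varchar length, or None when no limit is set (-1)."""
    if atttypmod == -1:
        return None
    return atttypmod - VARHDRSZ


print(varchar_to_atttypmod(200))  # value to write for varchar(200)
print(atttypmod_to_varchar(204))  # declared length read back from atttypmod
```

Running the select first and passing its result through atttypmod_to_varchar is a quick way to check the current length before changing it.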

Thursday, April 21, 2011

Quick Setup Instructions for SVN Compile and Local Install

Steps to make a custom build of SVN and place it in a non-system directory.

Make a place to store the new copy:
mkdir -p /home/myuser/customsvn/src/
cd /home/myuser/customsvn/src/

Download from Apache Subversion; place in local install src directory:
ls -la /home/myuser/customsvn/src/
subversion-1.6.xx.tar.bz2
subversion-deps-1.6.xx.tar.bz2

Untar subversion then subversion-deps:
tar -jxvf subversion-1.6.xx.tar.bz2
tar -jxvf subversion-deps-1.6.xx.tar.bz2

Configure Command (run from the extracted subversion-1.6.xx directory; configures without Apache stuff):
./configure --prefix=/home/myuser/customsvn/bin/svn/ --with-apxs=no --disable-mod-activation --with-serf=no --with-berkeley-db=no  | tee configure.jjj.log
make | tee make.jjj.log
make install | tee make_install.jjj.log

Tuesday, April 19, 2011

SVN Merging For Version 1.4-

Merging for svn 1.4- is a bit different than for newer versions.  
This is what I have found that works for merging a trunk to a branch:
svn merge \
  svn+ssh://svn.example.com/repo/myproject/branches/newbranch@HEAD \
  svn+ssh://svn.example.com/repo/myproject/trunk@HEAD \
  /sites/myproject/src/myproject-workingcopy


*NOTE: It seems odd that the trunk is the argument after the branch, but my trunk changes weren't being merged in when I did it the other way.

It seems that with svn 1.4 and earlier, you must specify the revision where your branch was last copied or merged from.  If your branch was created at revision 234, then the following command would merge it.
svn merge --dry-run \
  svn+ssh://svn.example.com/export/myproject/branches/newbranch@234 \
  svn+ssh://svn.example.com/export/myproject/trunk/@HEAD .


http://www.svnforum.org/threads/38394-1.4-merge

Friday, April 15, 2011

Django Content Type Quick Reference

from django.contrib.contenttypes.models import ContentType

# Get the content type for a given model
ct = ContentType.objects.get(app_label="myapp", model="mymodel")
# or
ct = ContentType.objects.get_for_model(MyModel)

# Get the class associated with the content type
model_cls = ct.model_class()

# Filter the queryset as normal              
queryset = model_cls.objects.filter()

# Get a specific object instance given a content type
ct.get_object_for_this_type(username='Guido')

http://docs.djangoproject.com/en/dev/ref/contrib/contenttypes/

Wednesday, April 13, 2011

Cool Python Modules

String Similarity

google-diff-match-patch - The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text.

pylevenshtein - Levenshtein (edit) distance, and edit operations, string similarity...

difflib - classes and functions for comparing sequences
http://docs.python.org/library/difflib.html
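
As a rough illustration of what these string-similarity tools compute, here is a small sketch using only the standard library. The difflib calls are the real module API. The edit_distance function is a hand-written stand-in for Levenshtein distance, not the pylevenshtein API, and both function names are made up for this example.

```python
import difflib


def similarity_ratio(a, b):
    """Return difflib's similarity score, a float between 0.0 and 1.0."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def edit_distance(a, b):
    """Plain dynamic-programming Levenshtein distance (illustration only)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


print(similarity_ratio("Document 1", "Document 2"))
print(edit_distance("kitten", "sitting"))
# Pick the candidates that most resemble a misspelled word.
print(difflib.get_close_matches("documnet", ["document", "article", "domain"]))
```

difflib gives a normalized score that is handy for ranking, while an edit distance counts the individual changes needed to turn one string into the other.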

Database Tools

uuid - Python built-in module for creating unique ids
http://docs.python.org/library/uuid.html

django-command-extensions: UUIDField - uuid field distributed with django-command-extensions.
http://code.google.com/p/django-command-extensions/

django-uuidfield - stand-alone Django UUIDField implementation.
https://github.com/dcramer/django-uuidfield
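
For reference, a short sketch of the built-in uuid module listed above. It only uses standard-library calls. The variable names are arbitrary, and the domain string is simply reused from the search notes as sample input.

```python
import uuid

# Random (version 4) UUID: a common choice for a unique record id.
record_id = uuid.uuid4()

# Name-based (version 5) UUID: the same namespace and name always give the same id.
domain_id = uuid.uuid5(uuid.NAMESPACE_DNS, "www.mysearch.com")

print(record_id)      # hyphenated 36-character form
print(record_id.hex)  # 32 hex characters, without hyphens
print(domain_id)

# A UUID can be rebuilt from its string form, e.g. after reading it from a database.
assert uuid.UUID(str(record_id)) == record_id
```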

System Tools

imp - access the internals of Python's import mechanism
http://docs.python.org/library/imp.html
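
Note that imp has been deprecated since Python 3.4 and was removed in Python 3.12, so the sketch below uses importlib instead. It is a swapped-in replacement rather than the imp API itself, and it shows the two things imp was commonly used for: locating a module and importing one by name.

```python
import importlib
import importlib.util

# Locate a module without naming it in an import statement
# (roughly what imp.find_module was used for).
spec = importlib.util.find_spec("json")
print(spec.name, spec.origin)

# Import a module from its dotted name given as a string.
json_module = importlib.import_module("json")
print(json_module.dumps({"title": "Document 1"}))

# A top-level module that does not exist yields None instead of raising.
missing = importlib.util.find_spec("no_such_module_xyz")
print(missing)
```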

Django Tools

django-autoslug - automatically populate a slug field
http://packages.python.org/django-autoslug/

Postgresql Sequences

Create a new sequence:
CREATE SEQUENCE mytable_myid_seq;

Add the sequence as the incrementor for a table column:
ALTER TABLE mytable
ALTER COLUMN myid
SET DEFAULT NEXTVAL('mytable_myid_seq');

Update existing rows in table with the sequence:
UPDATE mytable
SET myid = NEXTVAL('mytable_myid_seq');

Display the sequence information:
Select * from mytable_myid_seq;

Update a sequence to be one greater than a table's id column:
select setval('mytable_myid_seq', (select max(myid) + 1 from mytable));
http://railspikes.com/2009/3/6/duplicate-key-violates-unique-constraint (thanks Louis)
http://pointbeing.net/weblog/2008/03/mysql-versus-postgresql-adding-an-auto-increment-column-to-a-table.html
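
To make the nextval/setval behaviour above easier to picture, here is a toy in-memory model in Python. It is only a sketch: ToySequence is a made-up class, not a database API, and it assumes PostgreSQL's default increment of 1 and setval's default is_called = true.

```python
class ToySequence:
    """Toy model of a PostgreSQL sequence with an increment of 1."""

    def __init__(self, start=1):
        self.last_value = start
        self.is_called = False  # no value has been handed out yet

    def nextval(self):
        # The first call returns the start value; later calls increment first.
        if self.is_called:
            self.last_value += 1
        self.is_called = True
        return self.last_value

    def setval(self, value, is_called=True):
        # With is_called=True the next nextval() returns value + 1.
        self.last_value = value
        self.is_called = is_called
        return value


seq = ToySequence()
existing_ids = [seq.nextval() for _ in range(3)]  # like the UPDATE over three rows

seq.setval(max(existing_ids) + 1)  # the max + 1 form used in the note above
next_after_plus_one = seq.nextval()

seq.setval(max(existing_ids))      # plain max, for comparison
next_after_max = seq.nextval()

print(existing_ids, next_after_plus_one, next_after_max)
```

In this model both forms avoid duplicate keys. Setting the sequence to max + 1 just skips one value, because the next nextval call returns one more than whatever setval stored.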

Friday, April 01, 2011

Django News/Code At www.djangocurrent.com

I've created a more formal blog called www.djangocurrent.com, focused on Django.  I'll continue to post code snippets and information on this site, and will also post more formal write-ups at djangocurrent.com.  Check it out!