Posted by hank,
Sat Sep 15 18:17:00 UTC 2007
So, earlier this year, I decided it would be cool to mirror Wikipedia. So, I successfully set up a local copy on my system, and it’s been just sitting there ever since. But lately, I’ve been interested in fulltext indexing offered by various indexing engines, and Sphinx has looked especially tasty. So, I figured I’d sit down and try it today.
I pointed it at my 16GB of Wikipedia text in my MySQL database.
Continue Reading
Tags: sql
Posted by hank,
Fri May 11 04:22:00 UTC 2007
I was messing around writing some sweet SQL statements for amarok tonight. You can either run them using the MySQL console or using dcop (google ‘amarok dcop’). Here’s some examples:
# List artists and their average rating and number of ratings ordered by favorite artists first
SELECT a.name, avg(s.rating) avg, COUNT(s.rating) count FROM tags t, artist a, statistics s WHERE a.id=t.artist AND t.url=s.url GROUP BY a.name HAVING count > 10 ORDER BY avg DESC;
Tags: sql
Posted by hank,
Sat Feb 24 08:07:00 UTC 2007
So, a problem with MySQL (in my opinion) is that it is not case sensitive by default for VARCHAR fields. That makes getting rid of crappy entries like ’ITALY’ a bother. I mean, sure, I could just post-process it with ruby (see titleize), but what’s the fun in that.
select distinct BINARY(lead_country) from countries;
Ah, finally recognizes that ITALY is not Italy. One is definitely uglier than the other one. Now for the change.
update countries set lead_country = 'Italy' where lead_country = "ITALY";
Much better.
Tags: sql