Update: How to create a personal Thesaurus.com
Here is the code that splits the OpenOffice thesaurus file into multiple files, one per starting letter. I've only been teaching myself Perl for a month, so I have no idea how sloppy my code is.
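#!/usr/bin/perl
use strict;
use warnings;

my $fname = 'th_en_US_v2.dat';
open(my $in, '<', $fname) or die "Cannot open $fname: $!";
# Entries that come before the first "a" word go into "$fname.0".
open(my $out, '>', "$fname.0") or die "Cannot create $fname.0: $!";
my $ch = 'a';    # next starting letter to look for
my $str = <$in>; # ignore first line
while ( defined( $str = <$in> ) ) {
    # An entry header looks like "word|count". When the word starts with
    # the next letter, switch to a new output file for that letter.
    # The length check stops the switch once $ch++ has run past "z" to "aa".
    if ( length($ch) == 1 && $str =~ /^$ch.*\|/i ) {
        close($out);
        open($out, '>', "$fname.$ch") or die "Cannot create $fname.$ch: $!";
        $ch++;
    }
    print $out $str;
    chomp($str);
    my ($word, $count) = split(/\|/, $str);
    # Copy the $count lines that follow the header.
    for my $i ( 1 .. $count ) {
        my $line = <$in>;
        last unless defined $line;
        print $out $line;
    }
}
close($out);
close($in);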
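Once the file is split, looking up a word only needs the one file for its first letter. Here is a rough sketch of how that lookup might go. The `lookup` function name and the sample data are made up for illustration, and I'm assuming each line after a `word|count` header looks like `(pos)|synonym|synonym`, so check that against your own copy of the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read "word|count" entries from a filehandle and return the lines
# recorded for $wanted (an empty list if the word is not found).
# "lookup" is a hypothetical helper, not part of the script above.
sub lookup {
    my ($fh, $wanted) = @_;
    while ( defined( my $header = <$fh> ) ) {
        chomp $header;
        my ($word, $count) = split /\|/, $header;
        # Skip anything that does not look like a "word|count" header.
        next unless defined $word && defined $count && $count =~ /^\d+$/;
        my @meanings;
        for ( 1 .. $count ) {
            my $line = <$fh>;
            last unless defined $line;
            chomp $line;
            push @meanings, $line;
        }
        return @meanings if lc $word eq lc $wanted;
    }
    return;
}

# Hypothetical sample data, standing in for one of the split files.
my $sample = "able|1\n(adj)|capable|competent\n"
           . "about|2\n(adv)|roughly|around\n(adv)|nearby\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";
my @found = lookup($fh, 'about');
close($fh);
print "$_\n" for @found;
```

To run it against the real data, open the split file for the word's first letter (for example `th_en_US_v2.dat.a`) instead of the in-memory string.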