ikiwiki/todo/done

recently fixed TODO items

support creole markup

Creole is a wannabe standard markup language for all wikis.

It's an agreement achieved by many wiki engine developers.

Currently MoinMoin and Oddmuse support it, and a lot of wikis (dokuwiki, tiddlywiki, pmwiki, podwiki, etc.) have partial support. More info on support: http://www.wikicreole.org/wiki/Engines

Some useful information:

And there is a Perl module: Text::WikiCreole

Syntax file for vim: http://www.peter-hoffmann.com/code/vim/ (Since a typical ikiwiki user usually uses external editors. :))

Should be pretty easy to add a plugin to do it using Text::WikiCreole. --Joey
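A plugin along those lines could be sketched like this (an untested sketch; it assumes ikiwiki's htmlize hook API and Text::WikiCreole's creole_parse function):

```perl
#!/usr/bin/perl
# Hypothetical sketch of a creole htmlize plugin for ikiwiki.
package IkiWiki::Plugin::creole;

use warnings;
use strict;
use IkiWiki 2.00;

sub import {
	hook(type => "htmlize", id => "creole", call => \&htmlize);
}

sub htmlize (@) {
	my %params = @_;
	eval q{use Text::WikiCreole};
	# If the module is missing, pass the content through unconverted.
	return $params{content} if $@;
	return Text::WikiCreole::creole_parse($params{content});
}

1
```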

done

Posted Thu Jun 19 19:01:36 2008
quieten-bzr

The bzr plugin echoes "added: somefile.mdwn" when it adds somefile.mdwn to the repository. As a result, the redirect performed after a new article is created fails because the bzr output comes before the HTTP headers.

The fix is simply to call bzr with the --quiet switch. Something like this applied to bzr.pm works for me:

46c46
<   my @cmdline = ("bzr", $config{srcdir}, "update");
---
>   my @cmdline = ("bzr", "update", "--quiet", $config{srcdir});
74c74
<   my @cmdline = ("bzr", "commit", "-m", $message, "--author", $user,
---
>   my @cmdline = ("bzr", "commit", "--quiet", "-m", $message, "--author", $user, 
86c86
<   my @cmdline = ("bzr", "add", "$config{srcdir}/$file");
---
>   my @cmdline = ("bzr", "add", "--quiet", "$config{srcdir}/$file");
94a95,97
>   eval q{use CGI 'escapeHTML'};
>   error($@) if $@;
>

done, although I left off the escapeHTML thing which seems to be in your patch by accident.

(Please use diff -u BTW..) --Joey

Posted Tue Jun 10 13:46:01 2008 Tags:
search terms

The search plugin could use xapian terms to allow some special searches. For example, "title:foo", or "link:somepage", or "author:foo", or "copyright:GPL".

Reference: http://xapian.org/docs/omega/termprefixes.html

done for title and link, which seem like the really useful ones.
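For reference, the indexing side of such prefixed terms could be sketched with Search::Xapian along these lines (a sketch, not the actual plugin code; 'S' is xapian's conventional title prefix, 'XLINK' is an illustrative prefix for links, and omega would need matching prefix mappings in its config):

```perl
#!/usr/bin/perl
# Sketch: index a page with prefixed terms so omega can offer
# title: and link: searches. Names and paths here are illustrative.
use strict;
use warnings;
use Search::Xapian;

my $db = Search::Xapian::WritableDatabase->new(".ikiwiki/xapian/default",
	Search::Xapian::DB_CREATE_OR_OPEN);
my $doc = Search::Xapian::Document->new();
my $tg = Search::Xapian::TermGenerator->new();
$tg->set_document($doc);
$tg->index_text("My Page Title", 1, "S");  # title terms get the S prefix
$tg->index_text("page body text goes here");
$doc->add_term("XLINKsomepage");           # boolean term for a wikilink
$db->replace_document_by_term("Ppagename", $doc);
```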

Posted Tue Jun 10 13:46:01 2008
aggregate locking

The aggregate plugin's locking is suboptimal.

There should be no need to lock the wiki while aggregating -- it's annoying that long aggregate runs can block edits from happening. However, not locking would present problems. One is, if an aggregate run is happening, and the feed is removed, it could continue adding pages for that feed. Those pages would then become orphaned, and stick around, since the feed that had created them is gone, and thus there's no indication that they should be removed.

To fix that, garbage collect any pages that were created by aggregation once their feed is gone.

Are there other things that could happen while it's aggregating that it should check for?

Well, things like the feed url etc could change, and it would have to merge in such changes before saving the aggregation state. New feeds could also be added, feeds could be moved from one source page to another.

Merging that feed info seems doable, just re-load the aggregation state from disk, and set the message, lastupdate, numposts, and error fields to their new values if the feed still exists.


Another part of the mess is that it needs to avoid stacking multiple aggregate processes up if aggregation is very slow. Currently this is done by taking the lock in nonblocking mode, and not aggregating if it's locked. This has various problems, for example a page edit at the right time can prevent aggregation from happening.

Adding another lock just for aggregation could solve this. Check that lock (in checkconfig) and exit if another aggregator holds it.
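Such a dedicated lock could be taken in nonblocking mode roughly like this (a sketch; the lock file name is made up):

```perl
#!/usr/bin/perl
# Sketch of a separate aggregation lock, taken nonblockingly so a
# second aggregator exits instead of stacking up behind the first.
use strict;
use warnings;
use Fcntl qw(:flock);

sub aggregate_lock {
	my $lockfile = shift;
	open(my $fh, '>', $lockfile) || die "cannot write $lockfile: $!";
	if (! flock($fh, LOCK_EX | LOCK_NB)) {
		return undef; # another aggregator holds the lock
	}
	return $fh; # keep this handle open for the whole run
}
```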


The other part of the mess is that it currently does aggregation in checkconfig, locking the wiki for that, and loading state, and then dropping the lock, unloading state, and letting the render happen. Which reloads state. That state reloading is tricky to do just right.

A simple fix: Move the aggregation to the new 'render' hook. Then state would already be loaded, and there would be no reason to worry about reloading it.

Or aggregation could be kept in checkconfig.

done

Posted Tue Jun 10 13:46:00 2008
different search engine

done, using xapian-omega! --Joey

After using it for a while, my feeling is that hyperestraier, as used in the search plugin, is not robust enough for ikiwiki. It doesn't upgrade well, and it has a habit of sig-11 on certain input from time to time.

So some other engine should be found and used instead.

Enrico had one that he was using for debtags stuff that looked pretty good. That was Xapian, which has perl bindings in libsearch-xapian-perl. The nice thing about xapian is that it does a ranked search so it understands what words are most important in a search. (So does Lucene..) Another nice thing is it supports "more documents like this one" kind of search. --Joey

xapian

I've investigated xapian briefly. I think a custom xapian indexer and use of omega for cgi searches could work well for ikiwiki. --Joey

indexer

A custom indexer is needed because omindex isn't good enough for ikiwiki's needs for incremental rendering. (And because, since ikiwiki has page info in memory, it's silly to write it to disk and have omindex read it back.)

The indexer would run as an ikiwiki hook. It needs to be passed the page name and the content. Which hook to use is an open question.

The hook would remove any html from the content, and index it. It would need to add the same document data that omindex would.

The indexer (and deleter) will need a way to figure out the ids in xapian of the documents to delete. One way is storing the id of each page in the ikiwiki index.

The other way would be adding a special term to the xapian db that can be used with replace_document_by_term/delete_document_by_term. Hmm, let's use a term named "P".

The hook should try to avoid re-indexing pages that have not changed since they were last indexed. One problem is that, if a page with an inline is built, every inlined item will get each hook run. And so a naive hook would index each of those items, even though none of them have necessarily changed. Date stamps are one possibility. Another would be to have the hook skip indexing when %preprocessing is set (IkiWiki.pm would need to expose that variable). Another approach would be to use a needsbuild hook and only index the pages that are being built.
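Deletion via such a "P" term could then be sketched as follows (illustrative code, not the actual plugin; the exact spelling of the term is an assumption):

```perl
#!/usr/bin/perl
# Sketch: delete a page's xapian document via its unique "P" term,
# avoiding any need to track xapian docids in ikiwiki's own index.
use strict;
use warnings;
use Search::Xapian;

my $db = Search::Xapian::WritableDatabase->new(".ikiwiki/xapian/default",
	Search::Xapian::DB_CREATE_OR_OPEN);
$db->delete_document_by_term("P" . "some/page");
$db->flush();
```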

cgi

The cgi hook would exec omega to handle the searching, much as is done with estseek in the current search plugin.

It would first set OMEGA_CONFIG_FILE=.ikiwiki/omega.conf ; that omega.conf would set database_dir=.ikiwiki/xapian and probably also set a custom template_dir, which would have modified templates branded for ikiwiki. So the actual xapian db would be in .ikiwiki/xapian/default/.
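That omega.conf would be a tiny file along these lines (the template_dir path is an illustrative assumption):

```
database_dir .ikiwiki/xapian
template_dir .ikiwiki/omega-templates
```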

lucene

I've done a bit of prototyping on this. The current hip search library is Lucene. There's a Perl port called Plucene. Given that it's already packaged, as libplucene-perl, I assumed it would be a good starting point. I've written a very rough patch against IkiWiki/Plugin/search.pm to handle the indexing side (there's no facility to view the results yet, although I have a command-line interface working). That's below, and should apply to SVN trunk.

Of course, there are problems. ;-)

  • Plucene throws up a warning when running under Taint mode. There's a patch on the mailing list, but I haven't tried applying it yet. So for now you'll have to build IkiWiki with NOTAINT=1 make install.
  • If I kill ikiwiki while it's indexing, I can screw up Plucene's locks. I suspect that this will be an easy fix.

There is a C++ port of Lucene which is packaged as libclucene0. The Perl interface to this is called Lucene. This is supposed to be significantly faster, and presumably won't have the taint bug. The API is virtually the same, so it will be easy to switch over. I'd use this now, were it not for the lack of package. (I assume you won't want to make core functionality depend on installing a module from CPAN). I've never built a Debian package before, so I can either learn then try building this, or somebody else could do the honours. ;-)

If this seems a sensible approach, I'll write the CGI interface, and clean up the plugin. -- Ben

The weird thing about lucene is that these are all reimplementations of it. Thank you, Java.. The C++ version seems like a better choice to me (packages are trivial). --Joey

Might I suggest renaming the "search" plugin to "hyperestraier", and then creating new search plugins for different engines? No reason to pick a single replacement. --JoshTriplett

Index: IkiWiki/Plugin/search.pm
===================================================================
--- IkiWiki/Plugin/search.pm    (revision 2755)
+++ IkiWiki/Plugin/search.pm    (working copy)
@@ -1,33 +1,55 @@
 #!/usr/bin/perl
-# hyperestraier search engine plugin
 package IkiWiki::Plugin::search;

 use warnings;
 use strict;
 use IkiWiki;

+use Plucene::Analysis::SimpleAnalyzer;
+use Plucene::Document;
+use Plucene::Document::Field;
+use Plucene::Index::Reader;
+use Plucene::Index::Writer;
+use Plucene::QueryParser;
+use Plucene::Search::HitCollector;
+use Plucene::Search::IndexSearcher;
+
+#TODO: Run the Plucene optimiser after a rebuild
+#TODO: CGI query interface
+
+my $PLUCENE_DIR;
+# $config{wikistatedir} may not be defined at this point, so we delay setting $PLUCENE_DIR
+# until a subroutine actually needs it.
+sub init () {
+  error("Plucene: Statedir <$config{wikistatedir}> does not exist!") 
+    unless -e $config{wikistatedir};
+  $PLUCENE_DIR = $config{wikistatedir}.'/plucene';  
+}
+
 sub import { #{{{
-       hook(type => "getopt", id => "hyperestraier",
-               call => \&getopt);
-       hook(type => "checkconfig", id => "hyperestraier",
+       hook(type => "checkconfig", id => "plucene",
                call => \&checkconfig);
-       hook(type => "pagetemplate", id => "hyperestraier",
-               call => \&pagetemplate);
-       hook(type => "delete", id => "hyperestraier",
+       hook(type => "delete", id => "plucene",
                call => \&delete);
-       hook(type => "change", id => "hyperestraier",
+       hook(type => "change", id => "plucene",
                call => \&change);
-       hook(type => "cgi", id => "hyperestraier",
-               call => \&cgi);
 } # }}}

-sub getopt () { #{{{
-        eval q{use Getopt::Long};
-       error($@) if $@;
-        Getopt::Long::Configure('pass_through');
-        GetOptions("estseek=s" => \$config{estseek});
-} #}}}

+sub writer {
+  init();
+  return Plucene::Index::Writer->new(
+      $PLUCENE_DIR, Plucene::Analysis::SimpleAnalyzer->new(), 
+      (-e "$PLUCENE_DIR/segments" ? 0 : 1));
+}
+
+#TODO: Better name for this function.
+sub src2rendered_abs (@) {
+  return map { Encode::encode_utf8($config{destdir}."/$_") } 
+    map { @{$renderedfiles{pagename($_)}} } 
+    grep { defined pagetype($_) } @_;
+}
+
 sub checkconfig () { #{{{
        foreach my $required (qw(url cgiurl)) {
                if (! length $config{$required}) {
@@ -36,112 +58,55 @@
        }
 } #}}}

-my $form;
-sub pagetemplate (@) { #{{{
-       my %params=@_;
-       my $page=$params{page};
-       my $template=$params{template};
+#my $form;
+#sub pagetemplate (@) { #{{{
+#      my %params=@_;
+#      my $page=$params{page};
+#      my $template=$params{template};
+#
+#      # Add search box to page header.
+#      if ($template->query(name => "searchform")) {
+#              if (! defined $form) {
+#                      my $searchform = template("searchform.tmpl", blind_cache => 1);
+#                      $searchform->param(searchaction => $config{cgiurl});
+#                      $form=$searchform->output;
+#              }
+#
+#              $template->param(searchform => $form);
+#      }
+#} #}}}

-       # Add search box to page header.
-       if ($template->query(name => "searchform")) {
-               if (! defined $form) {
-                       my $searchform = template("searchform.tmpl", blind_cache => 1);
-                       $searchform->param(searchaction => $config{cgiurl});
-                       $form=$searchform->output;
-               }
-
-               $template->param(searchform => $form);
-       }
-} #}}}
-
 sub delete (@) { #{{{
-       debug(gettext("cleaning hyperestraier search index"));
-       estcmd("purge -cl");
-       estcfg();
+       debug("Plucene: purging: ".join(',',@_));
+       init();
+  my $reader = Plucene::Index::Reader->open($PLUCENE_DIR);
+  my @files = src2rendered_abs(@_);
+  for (@files) {
+    $reader->delete_term( Plucene::Index::Term->new({ field => "id", text => $_ }));
+  }
+  $reader->close;
 } #}}}

 sub change (@) { #{{{
-       debug(gettext("updating hyperestraier search index"));
-       estcmd("gather -cm -bc -cl -sd",
-               map {
-                       Encode::encode_utf8($config{destdir}."/".$_)
-                               foreach @{$renderedfiles{pagename($_)}};
-               } @_
-       );
-       estcfg();
+       debug("Plucene: updating search index");
+  init();
+  #TODO: Do we want to index source or rendered files?
+  #TODO: Store author, tags, etc. in distinct fields; may need new API hook.
+  my @files = src2rendered_abs(@_);
+  my $writer = writer();    
+   
+  for my $file (@files) {
+    my $doc = Plucene::Document->new;
+    $doc->add(Plucene::Document::Field->Keyword(id => $file));
+    my $data;
+    eval { $data = readfile($file) };
+    if ($@) {
+      debug("Plucene: can't read <$file> - $@");
+      next;
+    }
+    debug("Plucene: indexing <$file> (".length($data).")");
+    $doc->add(Plucene::Document::Field->UnStored('text' => $data));
+    $writer->add_document($doc);
+  }
 } #}}}
-
-sub cgi ($) { #{{{
-       my $cgi=shift;
-
-       if (defined $cgi->param('phrase') || defined $cgi->param("navi")) {
-               # only works for GET requests
-               chdir("$config{wikistatedir}/hyperestraier") || error("chdir: $!");
-               exec("./".IkiWiki::basename($config{cgiurl})) || error("estseek.cgi failed");
-       }
-} #}}}
-
-my $configured=0;
-sub estcfg () { #{{{
-       return if $configured;
-       $configured=1;
-
-       my $estdir="$config{wikistatedir}/hyperestraier";
-       my $cgi=IkiWiki::basename($config{cgiurl});
-       $cgi=~s/\..*$//;
-
-       my $newfile="$estdir/$cgi.tmpl.new";
-       my $cleanup = sub { unlink($newfile) };
-       open(TEMPLATE, ">:utf8", $newfile) || error("open $newfile: $!", $cleanup);
-       print TEMPLATE IkiWiki::misctemplate("search", 
-               "\n\n\n\n\n\n",
-               baseurl => IkiWiki::dirname($config{cgiurl})."/") ||
-                       error("write $newfile: $!", $cleanup);
-       close TEMPLATE || error("save $newfile: $!", $cleanup);
-       rename($newfile, "$estdir/$cgi.tmpl") ||
-               error("rename $newfile: $!", $cleanup);
-
-       $newfile="$estdir/$cgi.conf";
-       open(TEMPLATE, ">$newfile") || error("open $newfile: $!", $cleanup);
-       my $template=template("estseek.conf");
-       eval q{use Cwd 'abs_path'};
-       $template->param(
-               index => $estdir,
-               tmplfile => "$estdir/$cgi.tmpl",
-               destdir => abs_path($config{destdir}),
-               url => $config{url},
-       );
-       print TEMPLATE $template->output || error("write $newfile: $!", $cleanup);
-       close TEMPLATE || error("save $newfile: $!", $cleanup);
-       rename($newfile, "$estdir/$cgi.conf") ||
-               error("rename $newfile: $!", $cleanup);
-
-       $cgi="$estdir/".IkiWiki::basename($config{cgiurl});
-       unlink($cgi);
-       my $estseek = defined $config{estseek} ? $config{estseek} : '/usr/lib/estraier/estseek.cgi';
-       symlink($estseek, $cgi) || error("symlink $estseek $cgi: $!");
-} # }}}
-
-sub estcmd ($;@) { #{{{
-       my @params=split(' ', shift);
-       push @params, "-cl", "$config{wikistatedir}/hyperestraier";
-       if (@_) {
-               push @params, "-";
-       }
-
-       my $pid=open(CHILD, "|-");
-       if ($pid) {
-               # parent
-               foreach (@_) {
-                       print CHILD "$_\n";
-               }
-               close(CHILD) || print STDERR "estcmd @params exited nonzero: $?\n";
-       }
-       else {
-               # child
-               open(STDOUT, "/dev/null"); # shut it up (closing won't work)
-               exec("estcmd", @params) || error("can't run estcmd");
-       }
-} #}}}
-
-1
+1;
Posted Tue Jun 10 13:46:00 2008
Support/Switch to MultiMarkdown

Supporting or switching to MultiMarkdown would take care of a few of the outstanding feature requests. Quoting from the MultiMarkdown site:

MultiMarkdown is a modification of John Gruber's original Markdown.pl file. It uses the same basic syntax, with several additions:

  1. I have added a basic metadata feature, to allow the inclusion of metadata within a document that can be used in different ways based on the output format.

  2. I have allowed the automatic use of cross-references within a Markdown document. For instance, you can easily jump to [the Introduction][Introduction].

  3. I have incorporated John's proposed syntax for footnotes. Since he has not determined the output format, I created my own. Mainly, I wanted to be able to add footnotes to the LaTeX output; I was less concerned with the XHTML formatting.

  4. Most importantly, however, I have changed the way that the processed output is created, so that it is quite simple to export Markdown syntax to a variety of outputs. By setting the Format metadata to complete, you generate a well-formed XHTML page. You can then use XSLT to convert to virtually any format you like.

MultiMarkdown would solve the BibTex request and the multiple output formats would make the print_link request an easy fix. MultiMarkdown is actively developed and can be found at:

MultiMarkdown Homepage

I don't think MultiMarkdown solves the BibTeX request, but it might solve the request for LaTeX output. --JoshTriplett

Unless there's a way to disable a zillion of the features, please no. Do not switch to it. One thing that I like about markdown as opposed to most other ASCII markup languages, is that it has at least a bit of moderation on the syntax (although it could be even simpler). There's not a yet another reserved character lurking behind every corner. Not so in multimarkdown anymore. Footnotes, bibliography and internal references I could use, and they do not add any complex syntax: it's all inside the already reserved sequences of bracketed stuff. (If you can even say that ASCII markup languages have reserved sequences, as they randomly decide to interpret stuff, never actually failing on illegal input, like a proper language to write any serious documentation in, would do.) But tables, math, and so on, no thanks! Too much syntax! Syntax overload! Bzzzt! I don't want mischievous syntaxes lurking behind every corner, out to get me. --tuomov

ikiwiki already supports MultiMarkdown, since it has the same API as MarkDown. So if you install it as Markdown.pm (or as /usr/bin/markdown), it should Just Work. It would also be easy to support some other extension such as mmdwn to use multimarkdown installed as MuliMarkdown.pm, if someone wanted to do that for some reason -- just copy the mdwn plugin and lightly modify. --Joey
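Such a lightly modified copy of the mdwn plugin might look roughly like this (a sketch; the mmdwn name and extension are hypothetical, and Text::MultiMarkdown's exported markdown function is assumed):

```perl
#!/usr/bin/perl
# Hypothetical mmdwn plugin: htmlize .mmdwn files with Text::MultiMarkdown.
package IkiWiki::Plugin::mmdwn;

use warnings;
use strict;
use IkiWiki 2.00;

sub import {
	hook(type => "htmlize", id => "mmdwn", call => \&htmlize);
}

sub htmlize (@) {
	my %params = @_;
	eval q{use Text::MultiMarkdown 'markdown'};
	error(gettext("failed to load Text::MultiMarkdown")) if $@;
	return markdown($params{content});
}

1
```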

There's now a multimarkdown setup file option that uses Text::MultiMarkdown for .mdwn files. done --Joey

Posted Tue Jun 10 13:46:00 2008
configurable timezones

It would be nice if the user could set the timezone of the wiki, and have ikiwiki render the pages with that timezone.

This is nice for shared hosting, and other situations where the user doesn't have control over the server timezone.

done via the ENV setting in the setup file. --Joey
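In the setup file, that looks something like this (the TZ value is just an example):

```perl
# In ikiwiki.setup: environment variables to set when running ikiwiki,
# here used to control the timezone pages are rendered with.
ENV => {
	TZ => 'Europe/Helsinki',
},
```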

Posted Tue Jun 10 13:46:00 2008
minor adjustment to setup documentation for recentchanges feeds

Expand a comment so you know which bit to uncomment if you want to turn on feeds for recentchanges.

diff --git a/doc/ikiwiki.setup b/doc/ikiwiki.setup
index 99c81cf..7ca7687 100644
--- a/doc/ikiwiki.setup
+++ b/doc/ikiwiki.setup
@@ -91,9 +91,9 @@ use IkiWiki::Setup::Standard {
                #},
        ],

-       # Default to generating rss feeds for blogs?
+       # Default to generating rss feeds for blogs/recentchanges?
        #rss => 1,
-       # Default to generating atom feeds for blogs?
+       # Default to generating atom feeds for blogs/recentchanges?
        #atom => 1,
        # Allow generating feeds even if not generated by default?
        #allowrss => 1,

Hmm, recentchanges is just a blog. Of course the word "blog" is perhaps being used in too broad a sense here, since it tends to imply personal opinions, commentary, not-a-journalist, sitting-in-ones-underwear-typing, and lots of other fairly silly stuff. But I don't know of a better word w/o all these connotations. I've reworded it to not use the term "blog".. done --Joey

Posted Tue Jun 10 13:46:00 2008 Tags:
CGI method to pull/refresh

In some situations, it makes sense to have the repository in use by ikiwiki reside on a different machine. In that case, one could juggle SSH keys for the post-update hook. A better way may be to provide a different do parameter handler for the CGI, which would pull new commits to the working clone and refresh the wiki. Then, the remote post-update hook could just wget that URL. To prevent simple DoS attacks, one might assign a simple password.

done via the pinger and pingee plugins --Joey

Posted Tue Jun 10 13:45:59 2008 Tags:
shortcut with different link text

I'd like the ability to use a shortcut, but declare an explicit link text rather than using the link text defined on shortcuts. For example, if I create a shortcut protogit pointing to files in the xcb/proto.git gitweb repository, I don't always want to use the path to the file as the link text; I would like to link to src/xcb.xsd, but use the link text "XML Schema for the X Window System protocol". --JoshTriplett

If I understand you correctly, you can use Markdown [your link text](the path or URL) . Using your example: XML Schema for the X Window System protocol

If I don't understand this, can you give an HTML example? --JeremyReed

The problem is that a plain Markdown link like that doesn't use the shortcuts plugin. We would like to use the shortcuts plugin but add a descriptive text -- in this case [[xcbgit src/xcb.xsd|XML Schema...]]. The file src/xcb.xsd could be any url, and the point of shortcuts is that you get to shorten it. --Ethan

Some clarifications: You can always write something like [XML Schema for the X Window System Protocol](http://gitweb.freedesktop.org/?p=xcb/proto.git;a=blob;hb=HEAD;f=src/xcb.xsd) to get XML Schema for the X Window System Protocol. However, I want to define a shortcut to save the typing. If I define something like protogit pointing to http://gitweb.freedesktop.org/?p=xcb/proto.git;a=blob;hb=HEAD;f=%s, then I can write [[protogit src/xcb.xsd]]; however, I then can't change the link text to anything other than what the shortcut defines as the link text. I want to write something like [[XML Schema for the X Window System Protocol|protogit src/xcb.xsd]], just as I would write a wikilink like [[the_shortcuts_on_this_wiki|shortcuts]] to get the shortcuts on this wiki. (The order you suggest, with the preprocessor directive first, seems quite confusing since wikilinks work the other way around.) --JoshTriplett

How about [xcbgit XML_Schema|src/xcb.xsd]. That's the same way round as a wikilink, if you look at it the right way. The syntax Josh suggests is not currently possible in ikiwiki.

However.. Short wikilinks has some similar objectives in a way, and over there a similar syntax to what Josh proposes was suggested. So maybe I should modify how ikiwiki preprocessors work to make it doable. Although, I seem to have come up with a clear alternative syntax over there. --Joey


One possible alternative, would be a general [[url ]] scheme for all kinds of links. As mentioned in Short wikilinks, I have wanted a way to enter links to the wiki with markdown-style references, specifying the actual target elsewhere from the text, with just a short reference in the text. To facilitate automatic conversion from earlier (already markdownised) "blog", I finally ended up writing a custom plugin that simply gets the location of wikipage, and use markdown mechanisms:

Here [is][1] a link.

  [1]: [[l a_page_in_the_wiki]]

Obviously [this]([[l another_page]]) also works, although the syntax is quite cumbersome.

So the 'l' plugin inserts the location of the page there, and markdown does the rest. My plugin currently fails if it can't find the page, as that is sufficient for my needs. Differing colouring for non-existing pages is not doable in a straightforward manner with this approach.

For external links, that is no concern, however. So you could define for each shortcut an alternative directive that inserts the URL. Perhaps [[url shortcutname params]] or \[[@shortcutname params]] (if the preprocessor supported the @), and this could be extended to local links in an obvious manner: [[url page]] or [[@page]]. Now, if you could just get rid of the parentheses for markdown, for the short inline links --tuomov (who'd really rather not have two separate linking mechanisms: ikiwiki's heavy syntax and markdown's lighter one).


I've added code to make the [[foo 123]] syntax accept a desc parameter. I've named it like this to signal that it overrides the desc provided at shortcut-definition time. %s is expanded here as well.
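With that change, the shortcut example from earlier in this discussion could presumably be written like this (protogit being the hypothetical shortcut defined above):

```
[[protogit src/xcb.xsd desc="XML Schema for the X Window System protocol"]]
```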

done -- Adeodato Simó

Posted Sun Jun 8 00:14:52 2008