TeXtalk: an interview with Nicola Talbot

2013-06-08

Welcome to the TeXtalk! We have a very special guest for today’s interview: our friend Nicola Talbot, 3k+ rep, 20+ badges, 55+ answers, package writer, and author of a great series of TeXnical books! Get ready for this awesome interview!

Paulo Cereda

Dear friends, welcome to the TeXtalk! Our interviewee today is Nicola Talbot! :)

Could you tell us a bit about yourself? :)

Nicola Talbot

I have a BSc in Maths, a PhD in electronic engineering and a diploma in creative writing. I’m also a computer programmer and have done contract research work, but I’m now an independent writer/publisher.

I’m also an honorary lecturer at the University of East Anglia (UEA).

And I have written some LaTeX class files and packages, which are on CTAN.

Paulo

Any hobbies, besides of course TeXing? :)

Nicola

Reading. (Raymond Chandler, Dashiell Hammett and Tolkien mostly.)

user4035

What are your lectures about? Math or Electrical engineering?

Nicola

I rarely lecture. “Honorary lecturer” is a bit of a misnomer. I do some machine learning research work, give the occasional LaTeX tutorial and write class files for exams and problem sheets.

Paulo

How was your first contact with TeX, LaTeX and friends?

Nicola

So long ago, I’m not sure I can remember! :-) It was back when I was a student. I used to use a very primitive word processor on an Acorn Electron (cassette tape storage originally, then I moved up-market and bought an external floppy drive!). I then moved to an Acorn Archimedes, installed ArmTeX and started using LaTeX to write stories and my PhD thesis.

(That was LaTeX2.09 back then.)

Charles Stewart

You are a production editor at Microtome Publishing, which publishes some AI/CogSci journals. I’m hugely interested, but I’ll restrict myself to just two questions: What range of document representations do you have to handle? Do you have to glue together mixed LaTeX/PDF submissions and, if so, how does that work out?

Generally, I’m very interested in details about how small-scale publishers manage to work with authors and LaTeX (or ConTeXt) – feel free to ramble.

Nicola

Yes, I’m the production editor for the Challenges in Machine Learning (CiML) series published by Microtome. Each volume contains a number of articles that were previously published in the Journal of Machine Learning Research Workshop and Conference Proceedings (JMLR W&CP) plus unpublished articles in appendices. Each CiML book is produced as a B&W hardcopy and a colour hyperlinked PDF. The original idea was just to import all the individual PDFs, but there were a number of problems with that.

The main issues were the hyperlinks and bookmark generation, but also the CiML books have a different page size to the original JMLR W&CP articles. For aesthetic reasons, I also wanted all the articles to use the same fonts etc., rather than having some in Computer Modern, some using Times or whatever the authors thought looked nice. There were also some that had been written in Word that had been converted to PDF, and they stood out like a sore thumb.

So I decided in the end that the best solution (in terms of the best result, rather than the easiest method) was to import all the articles as LaTeX files rather than PDFs. This meant trying to get the combine class to play nicely with hyperref, which wasn’t easy. Both are problematic enough on their own, and together they break. So I ended up writing two new classes: jmlr (for the articles) and jmlrbook (for the book).

Together they manage to get both combine and hyperref to work together. LaTeX gathers things like the title and author list from the imported papers and adds them to the TOC on the next pass, and there are also commands for referencing the articles, for example, in the foreword, which helps to automate the process. I wrote a Perl script to run pdfLaTeX and BibTeX, but later changed this to a Java GUI that has some diagnostic tools.

The combine+hyperref alliance is fairly fragile and there are some packages that break the book (e.g. subfig), so I added checks for known problem packages into the jmlr class to deter authors from using them and provided commands, such as \subfigure, to reduce the number of packages that need loading.
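[Ed: A minimal sketch of how a class could check for a known problem package. This is not the actual jmlr code: the class name `myarticle` and the error text are hypothetical, and only the package name subfig comes from the interview.]

```latex
% myarticle.cls (hypothetical class file, for illustration only)
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{myarticle}[2013/06/08 sketch of a problem-package check]
\LoadClass{article}

% By the time this hook runs, every \usepackage in the preamble has
% been processed, so we can test whether a problem package was loaded.
\AtBeginDocument{%
  \@ifpackageloaded{subfig}%
    {\ClassError{myarticle}%
       {Package subfig is not supported by this class}%
       {Remove subfig from the preamble and use the commands
        provided by the class instead.}}%
    {}% nothing to do if the package is absent
}
```

[Ed: Raising an error rather than a warning is a design choice in this sketch; it stops the author's own build, so the problem surfaces before the source is submitted.]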

The main problems come when authors do something weird or send LaTeX source that doesn’t compile. I suppose they must use nonstopmode and don’t bother to check for errors, but this interrupts the build process and I have to fix every error.

This is extremely annoying when it’s something as trivial as a double-subscript error. Authors can easily fix that, but if the document has 50 double-subscript errors, I have to go through the document and fix every one of them to make the build work (and also I need to check if any of the reported errors are actually serious).
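[Ed: For readers who have not met it, a small example of the double-subscript error and three possible fixes. Which fix is right depends on what the author meant, so treat this as an illustration rather than a rule.]

```latex
\documentclass{article}
\begin{document}

% $x_i_j$   % error: "Double subscript" (left commented out so this compiles)

$x_{i_j}$   % j is a subscript of i
$x_{ij}$    % i and j form a single subscript
$x_i{}_j$   % two separate subscripts; the second attaches to an empty group

\end{document}
```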

Each article is scoped when it is imported, so there shouldn’t be any conflict if different articles happen to define the same command, but sometimes authors use \gdef instead of \newcommand and that’s caused serious problems when, for example, the author has decided to globally redefine an accent command that gets used in a later article.
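[Ed: A minimal sketch of why \gdef escapes the scoping, assuming a plain TeX group stands in for whatever mechanism the classes actually use to scope each imported article. The macro names are hypothetical.]

```latex
\documentclass{article}
\begin{document}

\begingroup % stands in for the scope around one imported article
  \newcommand{\articlemacro}{local}% local: forgotten at \endgroup
  \gdef\leakymacro{global}%          global: survives \endgroup
\endgroup

% \ifdefined is an e-TeX primitive, available in pdfLaTeX
\ifdefined\articlemacro still defined\else no longer defined\fi, % local one
\ifdefined\leakymacro still defined\else no longer defined\fi.   % global one

\end{document}
```

[Ed: The first test takes the \else branch and the second takes the true branch, so a later article would still see \leakymacro.]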

Encoding is another problem. One article may use utf8 while another may use latin1, but that’s not as problematic as the encoding switching within the same file, which I often see in bib files.

For the articles supplied in Word, I originally thought about Word to LaTeX converters, but I very soon discarded that idea. I don’t want to produce a close match to the original article, I want to produce the article in a format consistent with the rest of the document.

The only way to do this is to go through the Word document, copy and paste a paragraph into a TeX file, add in the necessary formatting commands, convert any strange characters (or occasionally delete unwanted control characters) and then move onto the next paragraph.

Okay, I think that’s the end of my ramble.

Charles

Thanks for your very detailed answer!

Paulo

How did you come up with the idea of writing “LaTeX for Complete Novices” and “Using LaTeX to Write a PhD Thesis”?

Nicola

They developed from the LaTeX tutorials I used to teach at UEA. I thought it would be useful to have extra materials the students could go to.

The tutorials were usually only around 2 or 3 hours for just 2 sessions, so it was difficult to cover much.

Paulo

Both books received very positive feedback. Is it challenging to keep the contents updated to the latest packages and tools? :)

Nicola

:-) Yes, it is. It took me a long time before I got around to updating the thesis one.

user4035

Did you prepare any problems in LaTeX for the students, like math problems, so that people could solve them and get more experience?

Nicola

Yes, my tutorials were usually in a computer lab where I explained a topic and then got them to try it out.

Paulo

Could you tell us about your children’s books? :)

Nicola

:-) I’ve published two illustrated children’s books with my Dickimaw Books imprint. Both were produced using LaTeX and my flowfram package. The first one, The Foolish Hedgehog, was developed from an assignment I had to do on a creative writing course and the second one, Quack, Quack, Quack. Give My Hat Back! was inspired by Paulo’s duck and arara :-)

Paulo

I should’ve copyrighted my duck :P

Nicola

:-)

Paulo

You once mentioned in the chatroom that you read your stories to children. How was the experience? :)

Nicola

It was very interesting. Magdalene (my illustrator) and I went to Heartsease Primary School in Norwich. I read “The Foolish Hedgehog” to the younger children and we answered their questions, and then we had a Q&A session with the older children. They asked at one point how I created my books, but trying to explain LaTeX to them was a little difficult! The kids were great and wrote us some very nice thank you letters.

Paulo

How lovely! :)

Nicola

We have another book reading in June in King’s Lynn and I’m also doing a book reading at my niece’s nursery in Kent.

Paulo

Yay! :)

Nicola

We’ve also got a stall at our church’s country and craft fayre next month, so we’re going to have a very busy June!

Paulo

Hopefully with ice cream. :)

Nicola

Yay! As long as the weather’s good!

Paulo

CTAN lists several of your packages. Could you tell us a bit about your first package? :)

Nicola

My first package was datetime. When I started using LaTeX2.09, I used the ukdate package to format \today in the UK rather than US format, but it dropped off the TeX distributions (presumably because of a licensing thing) so I wrote a package to replace it. That’s why datetime defaults to UK format, as that was its original purpose.

I later forked off the ordinal/number string code into a separate package (fmtcount).

user4035

I used this package. Nice to know that you are the author.

Nicola

:-)

Paulo

datatool and glossaries are also very famous! Could you tell us about them? :)

Nicola

The original glossary package actually started out as an example of how to do a glossary in a tutorial. The LaTeX books back then had little to say about glossaries except maybe a brief mention of \glossary. The glossary package then sort of evolved and before I knew it, it had become an unwieldy leviathan that I couldn’t control! So in the end I decided to rewrite it from scratch, which is how glossaries came about.

It’s a similar sort of thing with datatool. I wrote the original csvtools for mail merging, but I wasn’t happy with the underlying design and decided it was too complicated to maintain, so I rewrote it as datatool. The first version was very inefficient, but Morten Høgholm suggested a much better implementation, which vastly improved it.

Charles

You said “For the articles supplied in Word, I originally thought about Word to LaTeX converters, but I very soon discarded that idea. I don’t want to produce a close match to the original article, I want to produce the article in a format consistent with the rest of the document.” Have you looked at the docx2tex converter? It’s pretty much written with a “save the markup, throw away the formatting” mindset. Unfortunately, it is not maintained.

See the project page. An aside to my recommendation of docx2tex: it’s C#/.NET-based, so it’s Win32 technology, but it is supposed to work with Mono and so be usable on Linux. I’ve not tried to get it to work outside Windows.

Nicola

No, I haven’t. The problem with a lot of Word documents is that many Word users don’t use styles. For example, if they want to write a section header, they’ll just put some blank lines in, type the heading, select it, click on bold and do some more line breaks. It’s very difficult to get an automated system that can recognise that as a section heading, and what level of sectioning.

Charles

Yes, definitely. With those texts, your cut&paste technique is probably easiest.

Paulo

How did you become aware of the TeX.sx community? :)

Nicola

Joseph Wright mentioned it to me, but I didn’t join up straight away as it was an unfamiliar style of Q&A, and it takes me a while to get used to new ways of doing things. :-)

egreg

Sorry to come in so late. A big hug to Nicola.

Nicola

:-)

egreg

I’m also impressed with flowfram. I’ve personally never used it, as I’m more traditional in my page setup. But it has very nice features.

Nicola

Thanks :-) I first wrote it to help design posters, but these days I use it more for books. It does however have a lot of issues, especially regarding anything connected to the OR [Ed: Output Routine].

egreg

As Psmith would say, let’s wait till David Carlisle finishes doing xor.

Nicola

:-)

Charles

How many hours a month do you work for Microtome, generally? (I’m not just being idly curious, I’m interested in data for selling TeX-centric typesetting to small start-up publishers).

Nicola

I only work on a voluntary basis for the CiML books, so I only do the work when I have the time to spare (which is why there are still a couple of volumes pending).

Charles

I guess hours per volume makes more sense as a metric, then.

Nicola

It really depends on the quality of the source code that the authors provide.

Charles

And it’s probably impossible to give anyway, given that you are also writing the code for the workflow.

Nicola

As far as I can remember it took me about a week to work out how to get combine and hyperref to play nicely, but it is now getting easier as long as authors remember to use the jmlr class.

Charles

hyperref is evil. A necessary evil, but evil still.

Nicola

It is very useful, but it certainly can cause problems.

Paulo

Speaking of David Carlisle (of course, we need to tease him), which editor do you use? :)

Nicola

Hmm, let me think… vim. :-)

egreg

I was forgetting we have a package connection: you used my itnumpar in fmtcount. :)

Nicola

Yes :-) I’d really like to rewrite fmtcount with drop in language modules that other people can maintain, as my linguistic skills are appalling.

egreg

The main obstacle is that different languages have very different ideas about naming numbers!

Nicola

Yes, it is. I never intended to make fmtcount multilingual, just as I never intended to make datetime multilingual, but I got a lot of requests to make datetime compatible with babel and from there to make fmtcount multilingual. There are certainly days when I wish I hadn’t!

Paulo

Any plans for a new package, a new book, etc.? :) You did great on the audio extract, you might consider an audiobook in the future. :)

Nicola

My writing tutor suggested that the duck book would be great as a sticker book, but I’d have to find a printer and distributor for that. An audiobook would be great as well, but again I’d need to find how to go about burning, packaging and distributing. I also plan to release an ebook short story. That’s just waiting for a cover image (which Magdalene is painting). I’ve also got a novel pending, but that won’t be ready until at least next year.

I also plan to produce volume 3 of my LaTeX series, and have a new Java app pending.

Charles

If we’re going to be ambitious, what about a LaTeX class for doing children’s pop-up books?

Nicola

That would certainly be interesting. Again, it’s a case of finding someone to do the printing and distributing.

Paulo

Could you name something you really like in TeX/LaTeX? And is there something you dislike? :)

Nicola

The quality of typesetting is excellent. There are so many poorly typeset books these days. And I really like the way LaTeX separates content from style. As for dislike – I think that would have to be the output routine!

Paulo

Any plans for a LaTeX3 migration? :)

Nicola

Not yet. It’s on my huge list of things to do when I have some free time! :-)

I forgot to add (some may have already suspected) that the patch Morten provided for datatool was originally in LaTeX3 syntax, but having no LaTeX3 experience I felt more comfortable converting all the underscores to @ characters. In hindsight, perhaps that would’ve been a good time to learn about it.

Paulo

:)

What do you recommend for a newbie eager to learn TeX, LaTeX and friends? :) Your books are great resources. :)

Nicola

Thanks :-) An easy to read tutorial that was written fairly recently to ensure they don’t start using out of date packages. I really like Kopka and Daly, but that’s getting a bit old.

user4035

I glanced through your LaTeX exercises. They are small and have solutions, so students can check their own work. Will you prepare more? Also, you could improve this page by putting the text into 2 columns and squeezing “4. Using Special Characters” a little bit.

Nicola

I’m not planning on preparing more for that book. Two column layout is a bit awkward for HTML. It may look great in some browsers but unreadable in others.

user4035

Your site has good navigation. Did you prepare it from LaTeX?

Nicola

Thanks :-) I wrote most of the pages in HTML (in vim) but some of them, such as the LaTeX books and manuals, I wrote in LaTeX and converted to HTML using LaTeX2HTML. (I know a lot of people think it out-of-date, but there’s stuff I know how to do in LaTeX2HTML that I don’t know how to do in TeX4ht.)

user4035

The left menu, matching the current page, is especially good.

Nicola

That’s just in a floating div.

The online shop that I’m developing on my site is more complicated, but that’s not operational yet. That’s mostly osCommerce PHP files, which I’ve augmented.

texenthusiast

I have to appreciate your hard work and keen interest in spite of your many roles in daily life. Your Dickimaw LaTeX series is one of the best sets of tutorials: well written and in lucid language. Thank you very much for all your LaTeX work, whether packages, Jpgfdraw, children’s books or anything else. All of it is really great.

I learnt to make an MWE from your Creating a LaTeX Minimal Example; it’s a must for everyone debugging LaTeX or asking questions at TeX.sx.

Nicola

Thank you. That’s really great to know. :-)

texenthusiast

I think we need more women like you and Ulrike madam in LaTeX.

Nicola

:-)

Paulo

Thanks a million for this awesome interview! :)

Nicola

Thanks :-)

Thank you everyone for turning up, and thanks Paulo for being a great interviewer.

By the way, no one’s asked yet, but my favourite ice cream is vanilla (the nice soft variety with a flake) :-)

texenthusiast

Great to know. I am a butterscotch fan, but I often have vanilla too.

Paulo

What’s your favourite ice cream flavour?! :)

Nicola

LOL :-) I also remember liking guaraná when I tried it in Brazil many years ago. Oh and I like lemon sorbet as well. :-)

Paulo

If you ever do an audiobook, can you invite Joseph Wright and David Carlisle as special guests? :)

Nicola

Ooh, that would be fun! :-)

texenthusiast

Thanks, as always, for being our interviewer, Paulo.

Paulo

Ooh /blushes … I’m just a humble servant. :)

texenthusiast

What is your motivation for developing Java-based tools like Jpgfdraw? (Paulo’s arara also uses Java.)

Nicola

With Jpgfdraw (my first app) I decided it was about time I learnt Java and writing a graphics application was the motivation to do it. My favourite drawing application had always been Acorn’s !Draw, but I couldn’t use it any more when I moved over to Linux. Java isn’t perfect, but it’s platform independent, which is the main thing.

texenthusiast

Was your cake made using Jpgfdraw? I liked it very much.

Nicola

Yes. I’ve done quite a few clip art style line drawings in Jpgfdraw.

(I used to be on our church’s fund-raising committee and I did posters and flyers for events.)

texenthusiast

Great, OK. I am surprised at how you manage all your roles so perfectly!

Nicola

I’m surprised as well ;-)

texenthusiast

Is the MWE document being updated?

Nicola

I might update it at some point and move it over to my Dickimaw website.

texenthusiast

I want to know how one gets an ISBN for a book; should we contact the publisher for that, or can we self-publish and get one ourselves?

Nicola

ISBNs are assigned to publishers, so if you have a book with a publisher they will use one of their allocated ISBNs. If you want to self-publish, you have to set up a publishing company and buy a block of ISBNs. In the UK, the smallest block you can buy is 10. I believe in the US, ISBNs can be purchased individually, but always buy them from the ISBN agency – Nielsen UK in the UK. (I don’t know about other countries.) Never buy “second-hand” ISBNs.

(Self-publishing isn’t for the faint-hearted. It requires start-up capital and business expertise.)

texenthusiast

OK. Thanks for your good advice.

Thank you so much for answering our questions, madam. Thanks for your patience and kindness.

Nicola

That’s okay. It’s been very interesting.

texenthusiast

Which Linux distro do you normally work on?

Nicola

Fedora.

It’s been great talking to everyone. Cheerio.

Paulo

:)


Stay tuned for the next episode of the TeXtalk!
