Sunday, May 9, 2010

RhinoSpike: My suggestions to make getting recordings from native speakers even awesomer

In case you hadn't heard, I'm something of a fan of RhinoSpike, a website that lets you get native-speaker recordings of target-language text, which you can then download in MP3 format and add to your spaced-repetition system, listen to in iTunes, etc.

As I've been using it more, it's become clearer to me which parts of it I'd like to see improved. I'll run through them after the jump.

You may remember that, when RhinoSpike first contacted me about their product, they wrote:
You download [the audio file] and add it to your Anki/SRS flashcards…
Here's the current problem with doing that: there are way too many steps.

Let's say you've got ten vocabulary words that you're putting into Anki for which you want a recording. You've got to enter each of those into RhinoSpike one at a time. To submit a recording request on RhinoSpike, you have to (1) add a title, (2) select the language from a drop-down menu, (3) add the text to be recorded, and (4) press the "create" button. Optionally, you can include notes or tags. So, with just ten words, you've got at least 40 steps just to get the recordings for all of them.

So my first suggestion is the ability to submit recording requests in batches. Ideally, you'd just be able to upload a simple text file to accomplish this. That way, you can quickly compile all your requests in a text file and then upload it. I imagine that doing so would reduce the process to three steps, regardless of the number of requests that you have: (1) make a text file of your recording requests, (2) select that file for uploading, and (3) press submit.
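To make the text-file idea concrete, here is a rough sketch of how such a file might be read on the receiving end. Everything in it is an assumption for illustration: RhinoSpike has no such upload feature that I know of, and the one-request-per-line, tab-separated layout (title, language, text), the `RecordingRequest` class and the `parse_batch` function are all made up here.

```python
from dataclasses import dataclass


@dataclass
class RecordingRequest:
    """One recording request (hypothetical structure, not a RhinoSpike type)."""
    title: str
    language: str
    text: str


def parse_batch(contents: str) -> list[RecordingRequest]:
    """Parse an assumed format: one request per line, title<TAB>language<TAB>text."""
    requests = []
    for line_number, line in enumerate(contents.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored
        parts = [part.strip() for part in line.split("\t")]
        if len(parts) != 3 or not all(parts):
            raise ValueError(
                f"line {line_number}: expected title, language and text"
            )
        requests.append(RecordingRequest(*parts))
    return requests


# Two Japanese requests: kana in the title, the kanji word as the text.
sample = "たべる\tJapanese\t食べる\n\nのむ\tJapanese\t飲む\n"
batch = parse_batch(sample)
print(len(batch), "requests parsed")
```

With something along these lines, ten words or a hundred words cost the requester the same three steps.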

But even if you make that end of the process a cake walk, getting your recordings is still a multi-step process. Let's say you've gotten recordings for those ten words. They're now on your audio request page (as an example, here's mine). Let's say you now want to add each of those recordings to iTunes. For each one, you need to (1) click the "listen" link in the audio request list, (2) control-click the "MP3" link and select "Download Linked File As…", (3) type in a name for the file (RhinoSpike currently names the files with some meaningless series of numbers) and save it, (4) drag it from wherever you saved it to iTunes, and (5) add in whatever file info you need in iTunes (track name, etc.). That's 50 more steps just for 10 words. Ouch.

What's needed here is a way to make batch downloads of audio files directly to iTunes, similar to what happens when you download music from Amazon's or eMusic's online stores. This could start with a list of all of my recordings, clicking a checkbox next to the ones I want, and then getting them all downloaded to iTunes. The downloader would also automatically fill out the iTunes info fields: "Name" in iTunes would be RhinoSpike's "Title" field, "Artist" would be the recording user's name on RhinoSpike, "Year" would be the year recorded, "Album" would be "RhinoSpike", "Genre" would be RhinoSpike's "Language" field + "Spoken", "Lyrics" would be RhinoSpike's "Text" field, and they could even add some fancy-schmancy RhinoSpike cover art and check "Part of a compilation" so it displays as a single album.

Yeah, I could rock out to that.
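The field mapping described above is simple enough to write down as a lookup. This is only a sketch: the shape of the `recording` dictionary (keys such as `recorded_by` and an ISO-style `recorded_on` date) is my assumption, not anything RhinoSpike exposes, and the function only builds the tag values rather than writing them into an MP3 or into iTunes.

```python
def itunes_tags(recording: dict) -> dict:
    """Map a (hypothetical) RhinoSpike recording record to iTunes info fields."""
    return {
        "Name": recording["title"],
        "Artist": recording["recorded_by"],
        # Assumes the date is stored as YYYY-MM-DD; only the year is kept.
        "Year": recording["recorded_on"][:4],
        "Album": "RhinoSpike",
        "Genre": f'{recording["language"]} Spoken',
        "Lyrics": recording["text"],
        "Part of a compilation": True,
    }


example = {
    "title": "almorçar",
    "recorded_by": "some_user",  # placeholder user name
    "recorded_on": "2010-05-09",
    "language": "Portuguese",
    "text": "almorçar",
}
tags = itunes_tags(example)
for field, value in tags.items():
    print(f"{field}: {value}")
```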

To make it even awesomer, there'd be an option to "Automatically download all new recordings to iTunes". Then, as they come, you wouldn't need to do a thing but listen to your beautiful, native-speaker recordings. That would effectively mean you'd have no repetitive steps to do whatsoever on the downloading end of the process.

That takes care of iTunes, but what about adding them to your spaced-repetition system? I mentioned before that Anki lets you copy and paste an MP3's URL into an Anki field and then loads that MP3 into the field. That of course saves you a ton of steps, but you still have to do this manually, item by item, which means there are definitely more savings to be had.

So, once again, I'd love to see the ability to do this in batches. For Anki, this could come in the form of a plug-in that automatically adds RhinoSpike URLs to the Anki fields where the title and language match. I'd match on the title field rather than the text field so that for, say, a Japanese word, you can have the audio play with the pronunciation (e.g., kana in the title field) rather than with the text (e.g., a kanji word). The goal here, once again, would be for the user to have no further work to do in this process.
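The matching logic such a plug-in would need might look roughly like the sketch below. It deliberately does not touch Anki's real add-on API: notes and recordings are plain dictionaries, and the field names (`reading`, `language`, `audio`, `url`) and the example URLs are placeholders I've invented for illustration.

```python
def attach_audio(notes: list[dict], recordings: list[dict]) -> int:
    """Fill each note's empty audio field with the URL of the recording
    whose title and language match the note. Returns how many were filled."""
    url_by_key = {(r["title"], r["language"]): r["url"] for r in recordings}
    attached = 0
    for note in notes:
        url = url_by_key.get((note["reading"], note["language"]))
        if url and not note.get("audio"):
            note["audio"] = url
            attached += 1
    return attached


notes = [
    {"reading": "たべる", "language": "Japanese", "audio": ""},
    {"reading": "のむ", "language": "Japanese", "audio": ""},
]
recordings = [
    {"title": "たべる", "language": "Japanese", "url": "https://example.com/1.mp3"},
]
filled = attach_audio(notes, recordings)
print(filled, "note(s) updated")
```

Notes that already have audio are left alone, so running it again after new recordings arrive would only fill in the gaps.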

The suggestions above are all aimed at reducing the amount of time the user needs to spend doing things, but now let's turn to reducing the wait time for recordings. First and foremost, and mentioned in my earlier post, in-site recording is needed, similar to what's found on Livemocha or Cinch. RhinoSpike has already said that that's coming in a subsequent update, so we can look forward to that.

Besides making it easier to make recordings, another way to speed up the process is to automatically connect existing recordings with new requests. For instance, let's say I request and get a recording of the word "almorçar" in Portuguese. RhinoSpike now has a recording of that word, so when another person comes along and makes the same request, it can be satisfied immediately by simply using the existing recording on the system. That'll mean no wait for your recording if it's already on the system, and the number of things already on the system will grow over time.

In what order should recordings be automatically attached? In order of quality, I'd say, and for that they'd need a ranking mechanism. That would cause high-quality recordings to float to the top, and those would be the first ones linked to repeat requests.
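Matching a new request against existing recordings and picking the best-rated one could be sketched like this. The record layout, the numeric `rating` field and both function names are assumptions of mine; how RhinoSpike would actually store or rank recordings is unknown. Text is compared after trimming whitespace, ignoring case and normalizing Unicode, so that "Almorçar " still matches "almorçar".

```python
import unicodedata


def normalize(text: str) -> str:
    """Collapse whitespace, ignore case, and compose accented characters."""
    collapsed = " ".join(text.split())
    return unicodedata.normalize("NFC", collapsed).casefold()


def best_existing_recording(request: dict, recordings: list[dict]):
    """Return the highest-rated recording with the same language and text,
    or None if the request still needs a fresh recording."""
    wanted = normalize(request["text"])
    matches = [
        r for r in recordings
        if r["language"] == request["language"] and normalize(r["text"]) == wanted
    ]
    if not matches:
        return None
    return max(matches, key=lambda r: r["rating"])


existing = [
    {"text": "almorçar", "language": "Portuguese", "rating": 3, "id": "a"},
    {"text": "almorçar", "language": "Portuguese", "rating": 5, "id": "b"},
    {"text": "almorzar", "language": "Spanish", "rating": 4, "id": "c"},
]
new_request = {"text": " Almorçar ", "language": "Portuguese"}
match = best_existing_recording(new_request, existing)
print(match["id"] if match else "no existing recording")
```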

Lastly, I've got a few miscellaneous suggestions. First, you should be able to listen to recordings directly from your audio request list. As is, you need to click on "listen" before you can do that. This extra click should be eliminated (and then you won't have to wait for the pages to reload when you want to listen to one recording after another). Secondly, when putting up multiple requests, your last-selected language and last-input keywords should be remembered. Currently, the language always defaults to one particular language (Spanish, in my case, whereas I've been using the site primarily for Japanese, meaning that I've got to change it manually every single time) and the tags field always defaults to blank.


  1. Logistically, I wonder if there's a strong benefit to RhinoSpike making it a lot easier to upload requests. Because the easier it is to upload large batches of requests, the more work needs to be done on the other end by the person recording. Especially when you're breaking your recording requests up into smaller bits, there's additional time that the recorder needs to spend for each file - accessing the text, recording, uploading the file. Think about which would take longer: thirty 10-second recordings or one 5-minute one?

    So you can't effectively make it easier for the audio requesters without also making it easier on the other end. If you do, the site ends up with a surplus of unanswered requests.

  2. I thought of that myself, but I think if they make it easier to record (i.e., push a button on the site to start recording) and match up existing recordings with new, matching requests, they'll be able to overcome any request overload problem.

    They also don't seem to be having a lot of trouble fulfilling requests as is, so they can probably handle even more requests without making any changes at all.

    That said, I'd definitely make the in-site recording priority number one. Matching recordings is technically not all that tough, so I'd probably make that priority number two. From there, I'd move on to making batch uploads and downloads.

  3. You could put all the vocabulary words in a list in one audio request, and then use a program to split up the audio afterward. I think that would be more efficient.

    I think Rhinospike and Anki are best used for paragraphs rather than individual vocabulary words anyway.

  4. I was thinking of doing the same thing, but I'm not sure that that'd be quick. You'd have to go through the audio file, find each break point, break them apart and save them. To me, that seems like it'd take more time, but I suppose I'd have to try it out to say which one's slower or faster. In any case, it's definitely not as simple as it could be if you have to do that.

    In Anki, I use single words together with example sentences using that particular word, typically the sentence I got the word from in context. I usually just record the word to get its pronunciation repeated, but sometimes I get the sentences recorded as well. Paragraphs would seem to create a lot of inefficiency, because if there's only one word in the paragraph that you don't know, you have to repeat everything else as well.

  5. Ah, I do a similar thing in Anki: I have sentences with one word I wasn't familiar with when I first saw it.

    I don't use a whole paragraph in Anki. I meant that if there was a passage with multiple words you didn't know, you could put the whole thing in RhinoSpike and then split it into sentences, using only the ones you need. It would make the person read more than necessary, but it's good listening practice :)

  6. That would certainly be a lot less effort than breaking an audio file down word by word, but it seems like it still wouldn't be as quick as getting separate recordings and then doing what it takes to get them into Anki.

  7. If you are looking for individual words, I think it is easier to try out finding them on than making a request on RhinoSpike.