Write an audio metadata reader in Dart
A while ago, I tried to build my own music player. Flutter is great for this use case because of its multiplatform capabilities.
What do we need to build a good music player? A library to read audio metadata. The metadata is information about the artist’s name, the album’s name, the track number, the length, the release year, the cover image… Tons of very useful information we must display to the user.
During my journey, I searched for a suitable library that met my needs.
Let’s start with metadata_god. This library serves as a bridge to the Rust library lofty-rs. It functions reasonably well but does not provide all the metadata I required. I attempted to fork the repository and submit a pull request (PR) to address the issue. However, fixing the problem required me to learn Rust, comprehend the lofty-rs library (which is rather complex), submit my PR, wait for its approval and merging, then submit another PR to metadata_god and wait again. Additionally, flutter_rust_bridge only works for Flutter apps, not for simple Dart CLI apps.
When I tried to parse my music folder (3000 tracks, 29 GB), the library couldn’t parse some tracks:
[ERROR:flutter/runtime/dart_vm_initializer.cc(41)] Unhandled Exception: Instance of 'FrbAnyhowException'
If I wrapped the readMetadata function with a try/catch, I caught 87 errors and it took 2.3s to read all the metadata and keep the cover data in memory.
The other library I investigated is called flutter_media_metadata. It serves as a bridge between Flutter and a C++ library. For some obscure reason, the library doesn’t work anymore on my Linux system. When I tried it, it was slow… very slow. I’m using an app called Harmonoid that implements this library and it took 44s to index my folder.
All these libraries are essentially bridges to another library, and we lack a native Dart library. So why not develop a Dart library to read audio metadata? Reading metadata is IO-bound and Dart performs well in such scenarios. Additionally, I recall that a library built on flutter_rust_bridge weighs at least 3 MB, so developing a native Dart library could potentially reduce the app size.
Let’s now delve into the marvelous world of audio containers and binary formats! This article serves more as an introduction to audio metadata readers, where we’ll only scratch the surface of the specifications of audio metadata, rather than being a comprehensive Dart/Flutter tutorial.
ID3 Tag
ID3 is the metadata format used by mp3 and other audio formats. Four versions are in use: ID3v1, ID3v2.2, ID3v2.3, and ID3v2.4.
First, we read the first 10 bytes of the file. The first 3 bytes must be equal to ID3. With this information, we can disregard the file extension entirely and focus solely on the bytes. Bytes 4 and 5 hold the major version and the revision of the ID3 tag, byte 6 holds flags, and the last 4 bytes encode the size of the tag content as a synchsafe integer (only the low 7 bits of each byte are used).
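The header check described above can be sketched as follows. This is a minimal illustration that assumes the ID3v2 layout (3 magic bytes, 2 version bytes, 1 flags byte, 4 synchsafe size bytes); the names Id3Header, decodeSynchsafe and parseId3Header are made up for this example and do not come from any library.

```dart
import 'dart:typed_data';

/// Parsed ID3v2 tag header (hypothetical class for this sketch).
class Id3Header {
  final int majorVersion;
  final int revision;
  final int flags;

  /// Size of the tag content, excluding the 10-byte header.
  final int tagSize;

  const Id3Header(this.majorVersion, this.revision, this.flags, this.tagSize);
}

/// Decodes a 4-byte synchsafe integer: only the low 7 bits of each byte count.
int decodeSynchsafe(List<int> bytes) {
  return ((bytes[0] & 0x7F) << 21) |
      ((bytes[1] & 0x7F) << 14) |
      ((bytes[2] & 0x7F) << 7) |
      (bytes[3] & 0x7F);
}

/// Returns the parsed header, or null when [bytes] does not start with "ID3".
Id3Header? parseId3Header(Uint8List bytes) {
  if (bytes.length < 10) return null;
  // 0x49 0x44 0x33 is "ID3" in ASCII.
  if (bytes[0] != 0x49 || bytes[1] != 0x44 || bytes[2] != 0x33) return null;
  return Id3Header(
    bytes[3],
    bytes[4],
    bytes[5],
    decodeSynchsafe(bytes.sublist(6, 10)),
  );
}
```

Passing the first 10 bytes of a file to parseId3Header tells us both whether a tag is present and how many bytes of tag content follow.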
Now, we can parse the frames. Each frame consists of a header and its content. The header contains an “ID” that tells us what type of content the frame holds (the album’s name, for example), the length of the content, and some flags. We continue reading frames until we reach the end of the content.
The frame ID consists of 4 bytes (3 bytes in ID3v2.2). We use a switch statement to extract the information based on the frame ID. For example, TALB represents the album’s name, TBPM represents the BPM, and so on.
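To make the frame loop concrete, here is a rough sketch for ID3v2.3 and ID3v2.4 tags. It assumes a 10-byte frame header (4-byte ID, 4-byte size, 2 bytes of flags) and only decodes text frames stored as ISO-8859-1 or UTF-8; UTF-16 frames, unsynchronisation and extended headers are left out. The function names and the field names in the switch are illustrative choices.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Maps a few ID3v2.3/v2.4 frame IDs to readable field names.
String? fieldNameFor(String frameId) {
  switch (frameId) {
    case 'TIT2':
      return 'title';
    case 'TPE1':
      return 'artist';
    case 'TALB':
      return 'album';
    case 'TBPM':
      return 'bpm';
    default:
      return null; // frame not handled by this sketch
  }
}

/// Converts a 32-bit value holding a synchsafe integer to a plain integer.
int fromSynchsafe(int v) =>
    ((v & 0x7F000000) >> 3) |
    ((v & 0x007F0000) >> 2) |
    ((v & 0x00007F00) >> 1) |
    (v & 0x0000007F);

/// Reads the frames in [content], the tag body that follows the 10-byte tag
/// header, and returns the text fields this sketch knows about.
Map<String, String> readTextFrames(Uint8List content, int majorVersion) {
  final fields = <String, String>{};
  final view = ByteData.sublistView(content);
  var offset = 0;

  while (offset + 10 <= content.length) {
    // A zero byte where an ID should start means we reached the padding.
    if (content[offset] == 0) break;

    final id = String.fromCharCodes(content, offset, offset + 4);
    final rawSize = view.getUint32(offset + 4); // big-endian by default
    // v2.4 stores the frame size as a synchsafe integer, v2.3 as a plain one.
    final size = majorVersion == 4 ? fromSynchsafe(rawSize) : rawSize;
    final start = offset + 10; // skip ID (4) + size (4) + flags (2)
    final end = start + size;
    if (end > content.length) break;

    final name = fieldNameFor(id);
    if (name != null && size > 1) {
      final encoding = content[start]; // first content byte of a text frame
      final payload = content.sublist(start + 1, end);
      String? text;
      if (encoding == 0) text = latin1.decode(payload);
      if (encoding == 3) text = utf8.decode(payload, allowMalformed: true);
      // Drop the optional null terminator.
      if (text != null) fields[name] = text.replaceAll('\u0000', '');
    }
    offset = end;
  }
  return fields;
}
```

Returning a map keeps the sketch short; a real reader would more likely fill a typed metadata object inside the switch.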
The duration of the track is not always written in the metadata; therefore, we have to calculate it. To do this, we need to read the first MP3 frame, extract the bitrate and the frequency, and then use a simple formula to calculate the duration.
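As an illustration, the sketch below decodes the header of an MPEG-1 Layer III frame and estimates the duration of a constant-bitrate file as audio size × 8 / bitrate. This is one possible formula, not necessarily the one used in the library: it is wrong for variable-bitrate files, which need the Xing/VBRI header or a full frame scan. All names are hypothetical.

```dart
/// Bitrates in kbps for MPEG-1 Layer III, indexed by the 4-bit bitrate index.
/// Index 0 ("free") and 15 (invalid) are rejected below.
const mpeg1Layer3Bitrates = [
  0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0,
];

/// Sample rates in Hz for MPEG-1, indexed by the 2-bit sample rate index.
const mpeg1SampleRates = [44100, 48000, 32000];

class Mp3FrameInfo {
  final int bitrateKbps;
  final int sampleRate;
  const Mp3FrameInfo(this.bitrateKbps, this.sampleRate);
}

/// Decodes the 4-byte header of an MPEG-1 Layer III frame.
/// Returns null for anything else (other MPEG versions or layers, bad sync).
Mp3FrameInfo? parseMpeg1Layer3Header(List<int> header) {
  if (header.length < 4) return null;
  // The first 11 bits of a frame header are all set (frame sync).
  if (header[0] != 0xFF || (header[1] & 0xE0) != 0xE0) return null;
  final version = (header[1] >> 3) & 0x03; // 3 means MPEG-1
  final layer = (header[1] >> 1) & 0x03; // 1 means Layer III
  if (version != 3 || layer != 1) return null;

  final bitrateIndex = (header[2] >> 4) & 0x0F;
  final sampleRateIndex = (header[2] >> 2) & 0x03;
  if (bitrateIndex == 0 || bitrateIndex == 15 || sampleRateIndex == 3) {
    return null;
  }
  return Mp3FrameInfo(
    mpeg1Layer3Bitrates[bitrateIndex],
    mpeg1SampleRates[sampleRateIndex],
  );
}

/// Estimates the duration of a constant-bitrate stream.
/// bytes * 8 bits / (kbps * 1000 bits per second) gives seconds, so
/// bytes * 8 / kbps gives milliseconds.
Duration estimateCbrDuration(int audioBytes, int bitrateKbps) =>
    Duration(milliseconds: (audioBytes * 8 / bitrateKbps).round());
```

Here audioBytes is the file size minus the ID3 tag, so the tag size read from the header is needed before the estimate can be made.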
MP4 Metadata
First, to verify that the file uses MP4 metadata, bytes 5 to 8 must be equal to ftyp (the first 4 bytes hold the size of that box).
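A quick way to express this check in Dart: an MP4 box header is a 4-byte size followed by a 4-byte type, so the ftyp marker sits at offsets 4 to 7. The function name looksLikeMp4 is invented for this sketch.

```dart
import 'dart:typed_data';

/// Returns true when the first box of the file is an `ftyp` box.
/// The box type sits right after the 4-byte box size.
bool looksLikeMp4(Uint8List bytes) {
  if (bytes.length < 8) return false;
  return String.fromCharCodes(bytes, 4, 8) == 'ftyp';
}
```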
This format can be a bit complex. It’s composed of nested boxes that provide information about various aspects such as audio bitrate, video bitrate, and so on.
box 1
  box 2
    box 3
      - artist
      - album
      - disc
Figure: The nested boxes
The strategy involves reading the header of a box, which contains information about the box name and its size. MP4 metadata consists of a list of predefined boxes that we can utilize. We navigate through these boxes until we encounter a box with a name starting with ©. There are a few of them, each defining specific information.
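The navigation can be sketched like this. The example assumes the usual iTunes-style layout, where the © boxes live under moov, udta, meta and ilst, and where each of them wraps a data box whose payload starts with 8 bytes of type and locale information. Boxes with a 64-bit size or a size of 0 are not handled, and every name below is hypothetical.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// A box found in the file (hypothetical helper class).
class Mp4Box {
  final String type;
  final int payloadStart; // offset of the first byte after the 8-byte header
  final int end; // offset just past the box
  const Mp4Box(this.type, this.payloadStart, this.end);
}

/// Lists the boxes stored between [start] and [end].
Iterable<Mp4Box> readBoxes(Uint8List bytes, int start, int end) sync* {
  final view = ByteData.sublistView(bytes);
  var offset = start;
  while (offset + 8 <= end) {
    final size = view.getUint32(offset); // big-endian, includes the header
    // Latin-1 so that the byte 0xA9 decodes to "©".
    final type = latin1.decode(bytes.sublist(offset + 4, offset + 8));
    // Sizes 0 (until end of file) and 1 (64-bit size) stop this sketch.
    if (size < 8 || offset + size > end) break;
    yield Mp4Box(type, offset + 8, offset + size);
    offset += size;
  }
}

/// Collects the text stored in boxes whose name starts with "©".
Map<String, String> readCopyrightBoxes(Uint8List bytes) {
  final tags = <String, String>{};

  void walk(int start, int end) {
    for (final box in readBoxes(bytes, start, end)) {
      switch (box.type) {
        case 'moov':
        case 'udta':
        case 'ilst':
          walk(box.payloadStart, box.end);
          break;
        case 'meta':
          // `meta` is a "full box": skip its 4 bytes of version and flags.
          walk(box.payloadStart + 4, box.end);
          break;
        default:
          if (!box.type.startsWith('©')) break;
          for (final data in readBoxes(bytes, box.payloadStart, box.end)) {
            if (data.type == 'data' && data.end - data.payloadStart >= 8) {
              // Skip the 4-byte type indicator and the 4-byte locale.
              tags[box.type] = utf8.decode(
                bytes.sublist(data.payloadStart + 8, data.end),
                allowMalformed: true,
              );
            }
          }
      }
    }
  }

  walk(0, bytes.length);
  return tags;
}
```

With this sketch, the artist would come back under the key ©ART and the title under ©nam, provided the file follows the assumed layout.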
The box mvhd contains a timescale (the number of time units per second) and the duration expressed in those units, so dividing the duration by the timescale gives us the track’s duration in seconds.
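For reference, here is a small sketch that reads the duration from the payload of an mvhd box. It relies on the layout defined by the ISO base media file format: the payload stores a timescale (units per second) and a duration in those units, with 32-bit fields in version 0 and 64-bit time fields in version 1. The function name is an assumption of this example.

```dart
import 'dart:typed_data';

/// Computes the duration from the payload of an `mvhd` box, i.e. the bytes
/// that follow the 8-byte box header. Returns null on unexpected input.
Duration? parseMvhdDuration(Uint8List payload) {
  if (payload.isEmpty) return null;
  final view = ByteData.sublistView(payload);
  final version = payload[0];
  final int timescale;
  final int units;

  if (version == 0 && payload.length >= 20) {
    // version/flags (4), creation time (4), modification time (4)
    timescale = view.getUint32(12);
    units = view.getUint32(16);
  } else if (version == 1 && payload.length >= 32) {
    // version/flags (4), creation time (8), modification time (8)
    timescale = view.getUint32(20);
    // Read the 64-bit duration as two 32-bit halves.
    units = view.getUint32(24) * 0x100000000 + view.getUint32(28);
  } else {
    return null;
  }

  if (timescale == 0) return null;
  return Duration(milliseconds: (units * 1000 / timescale).round());
}
```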
Vorbis Comment
This one is straightforward. I will only handle flac, but the code structure is nearly identical for ogg, vorbis, and opus formats.
The flac file consists of blocks, each identified by an ID between 0 and 127. IDs between 0 and 6 hold special significance. We will read blocks with IDs 0 (streaminfo), 4 (vorbis comment), and 6 (picture).
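Walking the blocks could look like the sketch below. It assumes the layout from the FLAC specification: the file starts with the 4 bytes fLaC, and each metadata block has a 1-byte header (the top bit flags the last block, the other 7 bits are the block ID) followed by a 3-byte big-endian length. The FlacBlock class and the function name are invented for this example.

```dart
import 'dart:typed_data';

/// A metadata block we care about (hypothetical helper class).
class FlacBlock {
  final int type;
  final Uint8List data;
  const FlacBlock(this.type, this.data);
}

/// Returns the STREAMINFO (0), VORBIS_COMMENT (4) and PICTURE (6) blocks.
List<FlacBlock> readFlacBlocks(Uint8List bytes) {
  final blocks = <FlacBlock>[];
  // A FLAC stream starts with the 4 ASCII bytes "fLaC".
  if (bytes.length < 4 || String.fromCharCodes(bytes, 0, 4) != 'fLaC') {
    return blocks;
  }

  var offset = 4;
  var isLast = false;
  while (!isLast && offset + 4 <= bytes.length) {
    final header = bytes[offset];
    isLast = (header & 0x80) != 0; // top bit marks the last metadata block
    final type = header & 0x7F; // remaining 7 bits are the block ID
    final length = (bytes[offset + 1] << 16) |
        (bytes[offset + 2] << 8) |
        bytes[offset + 3];
    final start = offset + 4;
    final end = start + length;
    if (end > bytes.length) break;

    if (type == 0 || type == 4 || type == 6) {
      // A view avoids copying potentially large picture data.
      blocks.add(FlacBlock(type, Uint8List.sublistView(bytes, start, end)));
    }
    offset = end;
  }
  return blocks;
}
```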
From block 0, we will extract the sample rate, bits per sample, and the total samples in the stream. Now, the formula totalSamplesInStream / sampleRate * 1000 provides us with the duration of the track in milliseconds.
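The fields of block 0 are packed at the bit level, so a sketch helps. It assumes the 34-byte STREAMINFO layout from the FLAC specification, where the sample rate (20 bits), channel count (3 bits), bits per sample (5 bits) and total samples (36 bits) start at byte 10. The class and function names are hypothetical.

```dart
import 'dart:typed_data';

/// Values extracted from a STREAMINFO block (hypothetical class).
class FlacStreamInfo {
  final int sampleRate;
  final int channels;
  final int bitsPerSample;
  final int totalSamples;
  const FlacStreamInfo(
    this.sampleRate,
    this.channels,
    this.bitsPerSample,
    this.totalSamples,
  );

  /// totalSamples / sampleRate * 1000, rounded to the nearest millisecond.
  Duration get duration =>
      Duration(milliseconds: (totalSamples * 1000 / sampleRate).round());
}

/// Parses the 34-byte payload of a STREAMINFO block.
FlacStreamInfo? parseStreamInfo(Uint8List d) {
  if (d.length < 34) return null;
  // Bytes 0-9 hold the min/max block and frame sizes, unused here.
  // Sample rate: 20 bits spread over bytes 10, 11 and the top of byte 12.
  final sampleRate = (d[10] << 12) | (d[11] << 4) | (d[12] >> 4);
  if (sampleRate == 0) return null;
  // Channel count minus one: 3 bits of byte 12.
  final channels = ((d[12] >> 1) & 0x07) + 1;
  // Bits per sample minus one: last bit of byte 12 + top 4 bits of byte 13.
  final bitsPerSample = (((d[12] & 0x01) << 4) | (d[13] >> 4)) + 1;
  // Total samples: 36 bits, low 4 bits of byte 13 then bytes 14 to 17.
  final totalSamples = (d[13] & 0x0F) * 0x100000000 +
      ((d[14] << 24) | (d[15] << 16) | (d[16] << 8) | d[17]);
  return FlacStreamInfo(sampleRate, channels, bitsPerSample, totalSamples);
}
```

A total sample count of 0 means the value is unknown, in which case the computed duration is simply zero.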
Block 4 contains the metadata itself. In comparison with ID3, there isn’t as much information. All the fields are defined in the Vorbis Comment specification. They are structured like this: ARTIST=ACDC.
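Parsing these comments takes only a few lines. The sketch below assumes the layout from the Vorbis Comment specification: a length-prefixed vendor string, a comment count, then length-prefixed UTF-8 entries, with all lengths stored as 32-bit little-endian integers. Field names are case-insensitive and may repeat, hence the upper-casing and the list of values. The function name is made up for the example, and truncated input makes it throw a RangeError.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Parses the payload of a Vorbis comment block into a map from field name
/// to the list of values found for that field.
Map<String, List<String>> parseVorbisComments(Uint8List data) {
  final view = ByteData.sublistView(data);
  final comments = <String, List<String>>{};
  var offset = 0;

  // Every length in this block is a 32-bit little-endian integer.
  int readLength() {
    final value = view.getUint32(offset, Endian.little);
    offset += 4;
    return value;
  }

  // Skip the vendor string (the name of the encoder).
  final vendorLength = readLength();
  offset += vendorLength;

  final count = readLength();
  for (var i = 0; i < count; i++) {
    final length = readLength();
    final entry = utf8.decode(
      data.sublist(offset, offset + length),
      allowMalformed: true,
    );
    offset += length;

    final separator = entry.indexOf('=');
    if (separator <= 0) continue; // not a NAME=value entry
    final key = entry.substring(0, separator).toUpperCase();
    comments.putIfAbsent(key, () => <String>[]).add(
          entry.substring(separator + 1),
        );
  }
  return comments;
}
```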
Block 6 contains the data for one picture. We can determine whether the picture is the front or back cover of the album, and it’s possible to have more than one picture.
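Below is a minimal sketch of a picture block parser. It assumes the PICTURE layout from the FLAC specification: a picture type (3 is the front cover, 4 the back cover), a length-prefixed MIME type, a length-prefixed description, four 32-bit fields for width, height, colour depth and palette size, then the length-prefixed image bytes, all big-endian. The names are hypothetical, and truncated input makes the function throw a RangeError.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// A picture extracted from a PICTURE block (hypothetical class).
class FlacPicture {
  final int pictureType;
  final String mimeType;
  final Uint8List bytes;
  const FlacPicture(this.pictureType, this.mimeType, this.bytes);

  bool get isFrontCover => pictureType == 3;
  bool get isBackCover => pictureType == 4;
}

/// Parses the payload of a PICTURE block.
FlacPicture parseFlacPicture(Uint8List data) {
  final view = ByteData.sublistView(data);
  var offset = 0;

  // Every integer in this block is 32-bit big-endian.
  int readUint32() {
    final value = view.getUint32(offset);
    offset += 4;
    return value;
  }

  final pictureType = readUint32();

  final mimeLength = readUint32();
  final mimeType = ascii.decode(
    data.sublist(offset, offset + mimeLength),
    allowInvalid: true,
  );
  offset += mimeLength;

  // The description is skipped in this sketch.
  final descriptionLength = readUint32();
  offset += descriptionLength;

  // Skip width, height, colour depth and palette size (4 bytes each).
  offset += 16;

  final imageLength = readUint32();
  final bytes = Uint8List.sublistView(data, offset, offset + imageLength);
  return FlacPicture(pictureType, mimeType, bytes);
}
```

Since a file can hold several picture blocks, a caller would typically parse each one and keep the first whose isFrontCover is true.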
Conclusion
First, I’ve learned that implementing these metadata readers is challenging not because the code to write is complex, but because the documentation is difficult to find, not freely accessible, or not sufficiently clear.
Now that the library is complete and ready to crawl my music folder, I run the code and… it’s very fast. It only needs 0.9s to parse all my tracks.
The compressed package size is only 30 kB, which means that after compilation we would probably have something even smaller. By contrast, MetadataGod adds a 6 MB .so file on Linux and makes the Android APK 2.7 MB bigger.
It shows that it can be worth writing IO-bound libraries in Dart instead of using a bridge to a C++/Rust library. It makes maintenance by the Dart/Flutter community easier and faster. And because the code is pure Dart, we have the guarantee that it will work the same on every platform!
If I had to criticize my current library, I’d say it’s incomplete. We can only read metadata, not write or edit it. I don’t think it would be very hard to implement, just tedious, because it would require recomputing all the length fields for every format.