summaryrefslogtreecommitdiff
path: root/README.md
blob: 30e061d6e15694576b00fd723c05dd5bb9b4aae2 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
# specgram

![build](https://github.com/rimio/specgram/workflows/build/badge.svg)

Small program that computes and plots spectrograms, either in a live window or to disk, with support for stdin input.

## Preview

![example](resources/readme_images/example.png?raw=true "Example")

## Build and install

If you are using ArchLinux you can grab the latest release from the AUR package [specgram](https://aur.archlinux.org/packages/specgram/), or get the main branch with [specgram-git](https://aur.archlinux.org/packages/specgram-git/).

Otherwise, you can build and install the program from source:
```bash
# Clone the repo
git clone https://github.com/rimio/specgram.git
cd specgram && mkdir build && cd build

# Build
cmake ..
make

# Install
sudo make install
```

## Dependencies

This program dynamically links against [FFTW](http://www.fftw.org/) and [SFML 2.5](https://www.sfml-dev.org/).

The source code of [Taywee/args](https://github.com/Taywee/args) is embedded in the program (see ```src/args.hxx```).

## Usage

For a complete description of the program functionality please see the [manpage](man/specgram.1.pdf).

### Input and output modes

**specgram** has two mutually exclusive input methods: from standard input and from file.

In order to generate a spectrogram from an input file ```infile``` and write the output to ```output.png```:

```bash
specgram -i infile outfile.png
```

If no input file is specified, then the default behaviour is to read input data *indefinitely* from standard input.
For example, we can query PulseAudio for audio sources:

```bash
$ pactl list sources short
1	alsa_output.usb-BEHRINGER_UMC204HD_192k-00.analog-surround-40.monitor	module-alsa-card.c	s16le 4ch 44100Hz	IDLE
2	alsa_input.usb-BEHRINGER_UMC204HD_192k-00.analog-stereo	module-alsa-card.c	s32le 2ch 44100Hz	SUSPENDED
3	alsa_output.pci-0000_00_1f.3.iec958-stereo.monitor	module-alsa-card.c	s16le 2ch 44100Hz	SUSPENDED
4	stereo.A.monitor	module-remap-sink.c	s16le 2ch 44100Hz	IDLE
5	stereo.B.monitor	module-remap-sink.c	s16le 2ch 44100Hz	SUSPENDED
11	alsa_output.pci-0000_01_00.1.hdmi-stereo.monitor	module-alsa-card.c	s16le 2ch 44100Hz	SUSPENDED

$ export PASOURCE="stereo.A.monitor"
```

In my case the default sink is ```stereo.A```, and I use the monitor source ```stereo.A.monitor``` to capture what I'm hearing in my headphones:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram outfile.png
[2020-12-27 15:56:15.058] [info] Creating 1024-wide FFTW plan
[2020-12-27 15:56:15.058] [info] Input stream: signed 16bit integer at 44100Hz
```

The program will keep reading data and computing FFTs until end of times or until it receives a ```SIGINT```.
In the Linux terminal this can be achieved by pressing ```CTRL+C```.

Once the signal is received, the program stops reading data from input and writes to ```outfile.png``` whatever it cached so far:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram outfile.png
[2020-12-27 15:57:29.967] [info] Creating 1024-wide FFTW plan
[2020-12-27 15:57:29.967] [info] Input stream: signed 16bit integer at 44100Hz
^C[2020-12-27 15:57:31.813] [info] Terminating ...
```

It is sometimes useful to see the spectrogram in real time. Live mode can be enabled with the ```-l, --live``` flag:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l outfile.png
```

The spectrogram will now be displayed as it is being compute from standard input.
When either ```SIGINT``` is received or the live window is closed, the program will terminate and write ```outfile.png```.

If file output is not desired, only live mode can be used, and nothing is written upon termination:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l
```

For obvious reasons, live mode cannot be used with file input.

### Input options

In the above examples we assumed that the program input is 16-bit signed integer at 44.1kHz, which happens to be what my sound card (and many others) outputs by default.

We can, however, specify any other rate with ```-r, --rate``` and datatype with ```-d, --datatype```:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw --format=float32 --rate=48000 | specgram -l -r 48000 -d f32
```

The above example will read 32-bit floating point input at 48kHz.
For a full list of supported data types see the [manpage](man/specgram.1.pdf).

**NOTE:** The specified rate is used only for display purposes and for interpreting other command line parameters.
There's nothing stopping us from using a different rate than the actual device rate, going down to nanohertz:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -r 1e-8
```

#### Usage with FFmpeg

In order to generate a spectrogram for an encoded audio file, it is neccesarily to decode it first. This can be done with FFmpeg, using any of the [raw audio formats available](https://trac.ffmpeg.org/wiki/audio%20types#SampleFormats).

For example, to generate the spectrogram for an MP3 file:

```bash
$ ffmpeg -i input.mp3 -f s16le - | specgram output.png
```

Or, in order to use 32-bit data:

```bash
$ ffmpeg -i input.mp3 -f s32le - | specgram -d s32 output.png
```
Note that you will have to manually stop specgram with a SIGINT once the ffmpeg stream is finished.

### FFT options

The FFT window width can be specified with ```-f, --fft_width``` and the stride, that is the distance between the beginning of two subsequent FFT windows, can be specified with ```-g, --fft_stride```:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -f 2048 -g 1024 
```

The above will compute 2048 elements wide FFT windows with a stride of 1024 elements, that is with a 50% overlap between windows.

Usually, a larger FFT window will give better frequency resolution but worse time resolution (i.e. it will be harder to locate signals in the time domain).

A smaller stride will give a *smoother* and richer output, but will strain the CPU more.

**NOTE:** You will notice that there isn't much difference between the output of the above command and the others.
That is because the display width is different from the FFT window width.
To change the display width, see **Display Options** below.

Lastly, if you encounter high sample rate signals, for which you can't display a wide enough (or often enough) window, you can use window averaging (```-A, --average```).

```bash
$ rx_sdr -d 0 -g 50 -f 97300000 -s 960000 -F CF32 - | ./specgram -lq -r 960000 -d cf32 -A 20 
```

The above example consumes input at 960k samples per second from a RTL-SDR dongle, which at a 1024 wide FFT window would mean displaying over 900 windows per second; a bit much for the average PC, and for the average human to follow.

Averaging 20 windows gives us a much more reasonable 47 windows per second.

### Display options

To change the display width we can use ```-w, --width```:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -f 1024 -w 1024
```

As you will notice, the spectrogram is somewhat blurry, because the program is resampling the 513 element wide positive part of the FFT output to the display width of 1024.
If you need sharp, crisp spectrograms, then you can use ```-q, --no_resampling``` to disable resampling:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -lq -f 1024
```
When not resampling, you can no longer specify the display width, as it is computed from the rest of the parameters.

Another use case is displaying a specific band of frequencies, using ```-x, --fmin``` and ```-y, --fmax``` to set the frequency bounds.
For example, to zoom in on the 500-3000Hz band:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -lq -f 9192 -x 500 -y 3000 
```

The colormap can be specified with ```-c, --colormap```; see image below the example for supported colormaps:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -c orange
```

![colormaps](resources/readme_images/colormaps.png?raw=true "Colormaps")

In order to see how to specify the background, foreground and custom colormap colors, please see the [manpage](man/specgram.1.pdf).

While the live view has both axes and legend enabled by default, file output does not.
To enable them use ```-a, --axes``` or ```-e, --legend```.
Please note that axes are implicit when displaying the legend.

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -le outfile.png
```

Finally, if you'd like to rotate the spectrogram 90 degrees counter-clockwise, so as to read it from left to right, you can use ```-z, --horizontal```:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -lz
```

This flag applies to both the live window and output file.

### Live options

Use ```-k, --count``` to control the number of FFT windows displayed in live view:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -k 128
```

Use ```-t, --title``` to specify the live window title:

```bash
$ parec --channels=1 --device="${PASOURCE}" --raw | specgram -l -t "My spectrogram"
```

## License

Copyright (c) 2020-2023 Vasile Vilvoiu \<vasi@vilvoiu.ro>

**specgram** is free software; you can redistribute it and/or modify it under the terms of the MIT license.
See LICENSE for details.

## Acknowledgements

Taywee/args library by Taylor C. Richberger and Pavel Belikov, released under the MIT license.

Program icon by Flavia Fabian, released under the CC-BY-SA 4.0 license.

Share Tech Mono font by Carrois Type Design, released under Open Font License.

Special thanks to Eugen Stoianovici for code review and various fixes.