Diffstat (limited to 'man/specgram.1')
-rw-r--r-- man/specgram.1 355
1 files changed, 355 insertions, 0 deletions
diff --git a/man/specgram.1 b/man/specgram.1
new file mode 100644
index 0000000..e7a52cb
--- /dev/null
+++ b/man/specgram.1
@@ -0,0 +1,355 @@
+.TH SPECGRAM 1 "2020-12-29"
+
+.SH NAME
+specgram \- create spectrograms from raw files or standard input
+
+.SH SYNOPSIS
+.B specgram
+[\fB\-aehlqvz\fR]
+[\fB\-\-print_input\fR]
+[\fB\-\-print_fft\fR]
+[\fB\-\-print_output\fR]
+[\fB\-i, --input\fR=\fIINFILE\fR]
+[\fB\-r, --rate\fR=\fIRATE\fR]
+[\fB\-d, --datatype\fR=\fIDATA_TYPE\fR]
+[\fB\-p, --prescale\fR=\fIPRESCALE_FACTOR\fR]
+[\fB\-b, --block_size\fR=\fIBLOCK_SIZE\fR]
+[\fB\-f, --fft_width\fR=\fIFFT_WIDTH\fR]
+[\fB\-g, --fft_stride\fR=\fIFFT_STRIDE\fR]
+[\fB\-n, --window_function\fR=\fIWIN_FUNC\fR]
+[\fB\-m, --alias\fR=\fIALIAS\fR]
+[\fB\-A, --average\fR=\fIAVG_COUNT\fR]
+[\fB\-w, --width\fR=\fIWIDTH\fR]
+[\fB\-x, --fmin\fR=\fIFMIN\fR]
+[\fB\-y, --fmax\fR=\fIFMAX\fR]
+[\fB\-s, --scale\fR=\fISCALE\fR]
+[\fB\-c, --colormap\fR=\fICOLORMAP\fR]
+[\fB--bg-color\fR=\fIBGCOLOR\fR]
+[\fB--fg-color\fR=\fIFGCOLOR\fR]
+[\fB\-k, --count\fR=\fICOUNT\fR]
+[\fB\-t, --title\fR=\fITITLE\fR]
+.IR [outfile]
+
+.SH DESCRIPTION
+\fBspecgram\fR generates nice looking spectrograms from raw data, based on the options provided in the command line.
+
+The program has two output modes: file output when \fIoutfile\fR is provided and live output when \fB\-l, \-\-live\fR is provided.
+The two modes are not necessarily mutually exclusive, but behaviour may differ based on other options.
+
+The program has two input modes: file input when the \fB\-i, \-\-input\fR option is provided, or stdin input otherwise (default behaviour).
+
+In file input mode, the file is read in a synchronous manner until EOF is reached, and the spectrogram is generated into \fIoutfile\fR.
+Only file output is allowed in this mode, so \fIoutfile\fR is mandatory and \fB\-l, \-\-live\fR is disallowed.
+
+In stdin input mode, data is read in an asynchronous manner and for an indefinite amount of time.
+The spectrogram is updated as new data arrives and output is buffered in memory.
+
+In either input mode, when SIGINT is received (e.g. when the user presses CTRL+C in the terminal), the program stops listening for data and exits gracefully, writing \fIoutfile\fR if provided.
+This also happens in live output mode, when the live window is closed.
+If the program receives SIGINT again it will forcefully quit.
+
+See \fBEXAMPLES\fR for common use cases.
+
+.SH OPTIONS
+
+.TP
+.BR \fIoutfile\fR
+Optional output image file. Check \fISFML\fR documentation for supported file types, but PNG files are recommended.
+
+Either \fIoutfile\fR must be specified, \fB\-l, \-\-live\fR must be set, or both.
+
+.TP
+.BR \-h ", " \-\-help
+Display help message.
+
+.TP
+.BR \-v ", " \-\-version
+Display program version.
+
+.TP
+\fBINPUT OPTIONS\fR
+
+.TP
+.BR \-i ", " \-\-input =\fIINFILE\fR
+Input file name.
+If this option is provided, \fIINFILE\fR is handled as a raw dump of values (i.e. the input file format is not considered).
+The program will stop when EOF is encountered.
+
+If this option is not provided, data will be read indefinitely from stdin.
+
+.TP
+.BR \-r ", " \-\-rate =\fIRATE\fR
+Rate, in Hz, of the input data.
+Used for display purposes and computation of other parameters.
+The program does not perform rate limiting based on this parameter and will consume data as fast as it is available on stdin.
+
+Default is 44100.
+
+.TP
+.BR \-d ", " \-\-datatype =\fIDATA_TYPE\fR
+Data type of the input data.
+It is formed from an optional complex prefix (\fIc\fR), a type specifier (\fIu\fR for unsigned integer, \fIs\fR for signed integer, \fIf\fR for floating point) and a size suffix (in bits: 8, 16, 32, 64).
+
+Valid values are: u8, u16, u32, u64, s8, s16, s32, s64, f32, f64, cu8, cu16, cu32, cu64, cs8, cs16, cs32, cs64, cf32, cf64.
+
+Complex types are pairs of two values containing the real and imaginary part of the number, in this order.
+The size of the complex data type is twice that of the basic type. For example cf64 is 128-bit wide, corresponding to two 64-bit values.
+
+Default is s16.
+
+.TP
+.BR \-p ", " \-\-prescale =\fIPRESCALE_FACTOR\fR
+Input prescale factor.
+
+The following normalizations are applied to input values, regardless of whether they are part of a complex number:
+ \(bu unsigned values are normalized to [0.0 .. 1.0] based on the domain limits.
+ \(bu signed values are normalized to [-1.0 .. 1.0] based on the domain limits.
+ \(bu floating point values are left untouched, with the exception of NaN which is converted to a zero.
+
+After this normalization, the new value is multiplied by \fIPRESCALE_FACTOR\fR.
+This is mostly useful for adjusting your inputs to the scale, and is usually needed for floating point inputs (see \fB\-s, \-\-scale\fR).
+
+Default is 1.0.
+
+.TP
+.BR \-b ", " \-\-block_size =\fIBLOCK_SIZE\fR
+Block size: the number of data type sized values read from stdin at a time.
+The larger this value, the larger the latency of the live spectrogram.
+
+Default is 256.
+
+.TP
+\fBFFT OPTIONS\fR
+
+.TP
+.BR \-f ", " \-\-fft_width =\fIFFT_WIDTH\fR
+FFT window width.
+Lower values provide worse frequency resolution but better temporal resolution. Higher values provide better frequency resolution but worse temporal resolution.
+
+Default is 1024.
+
+.TP
+.BR \-g ", " \-\-fft_stride =\fIFFT_STRIDE\fR
+Stride (distance) between two subsequent FFT windows in the input.
+Value can be less than \fIFFT_WIDTH\fR in which case there is overlap between windows, larger than \fIFFT_WIDTH\fR in which case information is lost, or equal to \fIFFT_WIDTH\fR.
+
+Default is 1024.
+
+.TP
+.BR \-n ", " \-\-window_function =\fIWIN_FUNC\fR
+Window function to be applied to the input window before FFT is computed.
+Because of the discrete nature of the FFT, the input window is assumed to be periodic.
+In reality the input window is almost never periodic, so window functions are used to taper off the ends of the window and avoid jumps between the beginning and end samples.
+
+Valid values are: none, hann, hamming, blackman, nuttall.
+
+Default is hann.
+
+.TP
+.BR \-m ", " \-\-alias =\fIALIAS\fR
+Specifies whether aliasing between negative and positive frequencies exists.
+If set to true (\fI1\fR), then the bins of corresponding negative and positive frequencies are summed on both sides.
+
+Default is \fI0\fR (no) for complex data types and \fI1\fR (yes) otherwise.
+
+.TP
+.BR \-A ", " \-\-average =\fIAVG_COUNT\fR
+Number of windows to average before the mean is displayed.
+
+Use this for high sample rate signals, where either displaying many windows or computing too wide a window is not possible.
+
+Default is 1.
+
+.TP
+\fBDISPLAY OPTIONS\fR
+
+.TP
+.BR \-q ", " \-\-no_resampling
+Disables resampling of output FFT windows, generating clean and crisp output.
+This invalidates the use of \fB\-w, \-\-width\fR, as the actual display width is computed from other parameters.
+
+.TP
+.BR \-w ", " \-\-width =\fIWIDTH\fR
+Display width of spectrogram.
+Output FFT windows are resampled to this width, colorized and displayed.
+Cannot be used with \fB\-q, \-\-no_resampling\fR.
+
+Default is 512.
+
+.TP
+.BR \-x ", " \-\-fmin =\fIFMIN\fR
+Lower bound of the displayed frequency spectrum, in Hz.
+
+Default is -\fIRATE\fR/2 for complex data types, 0 otherwise.
+
+.TP
+.BR \-y ", " \-\-fmax =\fIFMAX\fR
+Upper bound of the displayed frequency spectrum, in Hz.
+
+Default is \fIRATE\fR/2.
+
+.TP
+.BR \-s ", " \-\-scale =\fISCALE\fR
+Spectrogram scale.
+Valid values are: dBFS.
+
+Default is dBFS.
+
+\fB[dBFS] NOTE:\fR By default, the scale has a -120dB lower bound.
+You can adjust it by appending the custom lower bound after the scale string (e.g. \fB\-s dBFS-60\fR for a -60dB lower bound).
+
+\fB[dBFS] NOTE:\fR The peak amplitude assumed for dBFS, after normalization and prescaling (see \fB\-p, \-\-prescale\fR), is 1.0.
+Thus, the correct input domains are:
+ \(bu [0 .. TYPE_MAX] for real unsigned integer values
+ \(bu [-TYPE_MAX .. TYPE_MAX] for real signed integer values
+ \(bu [-1.0 .. 1.0] for real floating point values
+ \(bu { x | abs(x) <= TYPE_MAX } for complex signed and unsigned integer values
+ \(bu { x | abs(x) <= 1.0 } for complex floating point values
+
+Input values outside these domains may lead to positive dBFS values, which will be clamped to zero.
+Use prescaling (\fB\-p, \-\-prescale\fR) to adjust your input to this domain.
+Integer inputs don't usually need prescaling, as they are normalized based on their domain's limits.
+
+.TP
+.BR \-c ", " \-\-colormap =\fICOLORMAP\fR
+Color scheme.
+Valid values are: jet, gray, purple, blue, green, orange, red.
+
+If \fICOLORMAP\fR is none of these values, then it is interpreted either as a 6 character hex string (RGB color) or an 8 character hex string (RGBA color).
+In this case, a gradient between the background color and the color specified by the hex string will be used as a color map.
+
+Default is jet.
+
+.TP
+.BR \-\-bg-color =\fIBGCOLOR\fR
+Background color. Either a 6 character hex string (RGB color) or an 8 character hex string (RGBA color).
+
+Default is 000000 (black).
+
+.TP
+.BR \-\-fg-color =\fIFGCOLOR\fR
+Foreground color. Either a 6 character hex string (RGB color) or an 8 character hex string (RGBA color).
+
+Default is ffffff (white).
+
+.TP
+.BR \-a ", " \-\-axes
+Displays axes.
+
+.TP
+.BR \-e ", " \-\-legend
+Displays legend. Implies \fB\-a, \-\-axes\fR.
+
+This is enabled in live view, but only for the live window (i.e. if both live view and file output are used, then file output will only display a legend if this flag is set by the user).
+
+.TP
+.BR \-z ", " \-\-horizontal
+Rotates spectrogram 90 degrees counterclockwise, making it readable left to right.
+
+.TP
+.BR \-\-print_input
+Prints input windows to standard output, after normalization and prescaling (see \fB\-p, \-\-prescale\fR).
+
+.TP
+.BR \-\-print_fft
+Prints FFT result to standard output, in FFTW order (i.e. freq[k] = \fIRATE\fR*k/N).
+
+.TP
+.BR \-\-print_output
+Prints output, before colorization, to standard output. Values are in the domain [0.0 .. 1.0].
+
+The length of the output may differ from that of the FFT result or the input, depending on the specified frequency bounds (see \fB\-x, \-\-fmin\fR and \fB\-y, \-\-fmax\fR).
+Negative frequencies precede positive frequencies.
+
+.TP
+\fBLIVE OPTIONS\fR
+
+.TP
+.BR \-l ", " \-\-live
+Displays a live rendering of the spectrogram being computed.
+
+Either this flag must be set, \fIoutfile\fR must be specified, or both.
+
+.TP
+.BR \-k ", " \-\-count =\fICOUNT\fR
+Number of FFT windows displayed in live spectrogram.
+
+Default is 512.
+
+.TP
+.BR \-t ", " \-\-title =\fITITLE\fR
+Title of live window.
+
+Default is 'Spectrogram'.
+
+.SH EXAMPLES
+
+.LP
+One of the most obvious use cases is displaying a live spectrogram from the PC audio output (you can retrieve \fIyourdevice\fP using "\fBpactl list sources short\fR"):
+
+.IP
+parec --channels=1 --device="\fIyourdevice\fR.monitor" --raw | \fBspecgram\fR -l
+
+.LP
+This assumes your device produces 16-bit signed output at 44.1kHz, which is usually the case.
+
+If you want the same, but wider and with a crisp look:
+
+.IP
+parec --channels=1 --device="\fIyourdevice\fR.monitor" --raw | \fBspecgram\fR -lq -f 2048
+
+.LP
+If you also want to render it to an output file:
+
+.IP
+parec --channels=1 --device="\fIyourdevice\fR.monitor" --raw | \fBspecgram\fR -lq -f 2048 \fIoutfile.png\fR
+
+.LP
+Keep in mind that when reading from stdin (like the above cases), the program expects SIGINT to stop generating FFT windows (e.g. by pressing CTRL+C in terminal).
+The file \fIoutfile.png\fR will be generated after SIGINT is received.
+
+Generating from a file to a file, with axes displayed and a crisp look:
+
+.IP
+\fBspecgram\fR -aq -f 2048 -i \fIinfile\fR \fIoutfile.png\fR
+
+.LP
+Generating from a file to a file, with axes and legend displayed, but zooming in on the 2-4kHz band:
+
+.IP
+\fBspecgram\fR -e -f 2048 -x 2000 -y 4000 -i \fIinfile\fR \fIoutfile.png\fR
+
+.LP
+Render a crisp output with a transparent background, so it can be embedded in a document:
+
+.IP
+\fBspecgram\fR -qe --bg-color=00000000 -i \fIinfile\fR \fIoutfile.png\fR
+
+.SH BUGS
+
+Frequency bounds (\fB\-x, \-\-fmin\fR and \fB\-y, \-\-fmax\fR) may exceed FFT window frequency limits when resampling is enabled (i.e. default behaviour), but may not do so when resampling is disabled (\fB\-q, \-\-no_resampling\fR).
+This inconsistency is known behaviour and, while not necessarily nice, does not impact usability in a meaningful manner.
+Ideally exceeding these limits should be allowed in both cases, and zero padding should be performed.
+
+Moreover, when using the \fB\-q, \-\-no_resampling\fR flag, the frequency limits are \[+-]\fIRATE\fR*(\fIFFT_WIDTH\fR-1)/(2*\fIFFT_WIDTH\fR) when \fIFFT_WIDTH\fR is odd
+and -\fIRATE\fR*(\fIFFT_WIDTH\fR-2)/(2*\fIFFT_WIDTH\fR) to \fIRATE\fR/2 when \fIFFT_WIDTH\fR is even.
+This is a bit different from the behaviour of NumPy's implementation of fftfreq and aims to make it easier to display the Nyquist frequency component for non-complex inputs.
+
+The above upper limits are enforced silently in the default values of \fB\-x, \-\-fmin\fR and \fB\-y, \-\-fmax\fR, but for brevity are not mentioned in this manpage's \fBOPTIONS\fR section or in the program help screen.
+
+.SH AUTHORS
+
+Copyright (c) 2020 Vasile Vilvoiu <vasi.vilvoiu@gmail.com>
+
+\fBspecgram\fR is free software; you can redistribute it and/or modify it under the terms of the MIT license.
+See LICENSE for details.
+
+.SH ACKNOWLEDGEMENTS
+
+Taywee/args library by Taylor C. Richberger and Pavel Belikov, released under the MIT license.
+
+Program icon by Flavia Fabian, released under the CC-BY-SA 4.0 license.
+
+Share Tech Mono font by Carrois Type Design, released under Open Font License.
+
+Special thanks to Eugen Stoianovici for code review and various fixes.
\ No newline at end of file
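The normalization and prescaling rules described under \-p, \-\-prescale, together with the dBFS notes under \-s, \-\-scale, can be illustrated with a small Python sketch. This is not taken from the specgram sources: the function names `normalize` and `to_dbfs` are hypothetical, dividing integers by TYPE_MAX is one reading of "based on the domain limits", and the 20*log10 amplitude-to-dB mapping is an assumption. Only the target ranges, the NaN handling, the -120 dB default lower bound and the clamping of positive values to zero come from the man page text.

```python
import math


def normalize(value, datatype, prescale=1.0):
    """Normalize one real sample, then apply the prescale factor.

    datatype is a real type string such as "u8", "s16" or "f32".
    Assumption: integers are divided by TYPE_MAX of their type.
    """
    kind, bits = datatype[0], int(datatype[1:])
    if kind == "u":
        # [0 .. TYPE_MAX] maps to [0.0 .. 1.0]
        out = value / (2 ** bits - 1)
    elif kind == "s":
        # [-TYPE_MAX .. TYPE_MAX] maps to [-1.0 .. 1.0]
        out = value / (2 ** (bits - 1) - 1)
    elif kind == "f":
        # Floats are left untouched, except NaN which becomes zero.
        out = 0.0 if math.isnan(value) else float(value)
    else:
        raise ValueError(f"unsupported data type: {datatype}")
    return out * prescale


def to_dbfs(amplitude, floor_db=-120.0):
    """Map an amplitude to dBFS, assuming a full-scale peak of 1.0.

    Assumption: 20*log10(|amplitude|), clamped to [floor_db, 0.0].
    """
    magnitude = abs(amplitude)
    if magnitude == 0.0:
        return floor_db
    return min(0.0, max(floor_db, 20.0 * math.log10(magnitude)))


if __name__ == "__main__":
    # A full-scale s16 sample sits at the top of the dBFS scale.
    print(to_dbfs(normalize(32767, "s16")))
    # A float input with peak 2.0 needs a prescale of 0.5 to avoid clamping.
    print(to_dbfs(normalize(2.0, "f32", prescale=0.5)))
```

Under these assumptions, integer inputs land inside the expected domain without prescaling, while floating point inputs whose peak is not 1.0 are the case where a prescale factor is needed.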