IRCAM - AudioSculpt - support

This support page is current for AudioSculpt 2.9.4.

AudioSculpt is a suite of software tools bundled into one package. Most AudioSculpt analysis and processing relies on the phase vocoder and the Short-Time Fourier Transform. Audio files (or, in Max/MSP, a live signal; see the SVP objects in Max) are transformed into a frequency-domain representation, which allows fine-grained processing of the individual frequency components of a sound.

Other uses of this program include: sonogram analysis, time stretching, marker assignment based on amplitude or frequency material, partial tracking, virtual fundamental recognition, reduction of frequency material to relevant perceptual elements, brick-wall filtering and all other filtering operations, generalized cross-synthesis, source-filter cross-synthesis, synthesis from edited partials, and image filtering (taking a digital photo and using it as a grayscale filter).
--------------

AudioSculpt does this:

1) sound is submitted to an analysis (most often FFT based) describing the spectral evolution of the sound in time.

2) the spectral data is modified through some kind of process

3) a new sound is produced through re-synthesis of the processed data that was based on the original sound.
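
The three steps above can be sketched in a few lines of Python with NumPy. This is only an illustration of the analyse / modify / resynthesize idea, not AudioSculpt's own code: the function names, the Hanning window, the 1024-sample window, the 128-sample step and the test signal are all assumptions made for the example.

```python
import numpy as np

def stft(signal, window_size=1024, step=128):
    """Step 1: analyse the sound as a sequence of windowed FFT frames."""
    window = np.hanning(window_size)
    starts = range(0, len(signal) - window_size + 1, step)
    return np.array([np.fft.rfft(window * signal[s:s + window_size]) for s in starts])

def brick_wall_lowpass(frames, cutoff_hz, sample_rate, window_size=1024):
    """Step 2: modify the spectral data (here: zero every bin above the cutoff)."""
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sample_rate)
    out = frames.copy()
    out[:, freqs > cutoff_hz] = 0.0
    return out

def istft(frames, window_size=1024, step=128):
    """Step 3: resynthesize by overlap-adding the inverse FFT of each frame."""
    window = np.hanning(window_size)
    length = (len(frames) - 1) * step + window_size
    output = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        s = i * step
        output[s:s + window_size] += window * np.fft.irfft(frame, n=window_size)
        norm[s:s + window_size] += window ** 2
    # Divide by the summed squared window so untouched frames rebuild the input.
    return output / np.maximum(norm, 1e-8)

# Hypothetical test sound: a 440 Hz tone plus a quieter 5000 Hz tone, one second long.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
low = np.sin(2 * np.pi * 440 * t)
sound = low + 0.5 * np.sin(2 * np.pi * 5000 * t)

frames = stft(sound)
unchanged = istft(frames)                                          # no processing
filtered = istft(brick_wall_lowpass(frames, 1000.0, sample_rate))  # 5000 Hz tone removed
```

Away from the very start and end of the file, `unchanged` matches the input and `filtered` is close to the 440 Hz tone alone.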

IMPORTANT: PHASE VOCODING GENERALLY WORKS WELL WITH HARMONIC, STATIC, OR SMOOTHLY CHANGING TONES.
NOISY SOUNDS LIKE RASPY OR BREATHY VOICES, OR ANY SOUND THAT CHANGES RAPIDLY ON A
TIMESCALE OF A FEW MILLISECONDS, WILL BE DIFFICULT AND WILL PROBABLY RESULT IN THE INTRODUCTION OF
UNWANTED SONIC DEBRIS, USUALLY CAUSED BY ANALYSIS ERRORS (PHASE DISTORTIONS). This is not surprising when you think of trying to represent white noise as a bunch of amplitude-varying sine tones (yuck!!).

Recent versions of AudioSculpt have introduced other analysis methods and improvements that allow for working with inharmonic and noisy signals.

AudioSculpt TIP: ALWAYS NORMALIZE THE NEW SOUNDFILE WHEN PROCESSING AND SAVING TO DISK
AudioSculpt TIP: YOU MIGHT CONSIDER USING THE AUDIOSCULPT 32 BIT SOUND FORMAT FOR PROCESSING REQUIRING EXTRA-FINE RESOLUTION.
This format only works in AudioSculpt, but once you are done you can save as AIFF.

-----------------------------------------
SONOGRAM ANALYSIS

TIP: analysis parameters can be fine tuned but factory settings work in most cases. Try those first.

The five types of analysis are:

1) FFT -(Short Time) Fourier Transform (spectrum divided into discrete number of amplitude varying sinusoidal components) -- THIS IS THE ONE TO USE MOST OFTEN

2) LPC - linear predictive coding (spectrum modeled as a number of filters with center freq, bandwidth and amplitude)

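3) Discrete Cepstrum (a spectral envelope estimated from the detected spectral peaks using cepstral coefficients)
4) Reassigned Spectrum (an FFT-based sonogram in which energy is relocated to sharpen its time and frequency localization)
5) True Envelope (a spectral envelope estimated by iterative cepstral smoothing)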

Included with these forms of analysis are:

• peak detection in short-time spectra;
• an estimation of the fundamental frequency;
• the evaluation of the perceptual pitches and weights (Terhardt’s algorithm).
• Partial Tracking: detection and tracking of the partials in the spectral analysis.
• Event Detection: detection of spectral changes that may correspond to the onset of notes or to other events in the
sonogram (the Place Markers feature)
• Formant analysis
• Loudness: calculation of the loudness according to a new algorithm by Moore and Glasberg.

None of these forms of analysis modify the sound file. The results are
stored in a text file and/or shown on the sonogram display.

------------
ANALYSIS PARAMETERS

FFT (Fast Fourier Transform) analysis (results are displayed as a sonogram image or a text file with tons of data)

WINDOW SIZE: the number of samples (a small segment of waveform) that will be analyzed at any given time.

"The main factor determining the analysis properties of spectral or partial analysis is the window size. The
window size determines the time and frequency resolution of the analysis that will be performed. "

• The "Window Size" and "Fundamental Frequency" fields are interdependent.
• When the window size is altered, other parameter settings will automatically update.
• A large window size returns better spectral (frequency domain) information at the expense of temporal resolution.
• A small window size returns better temporal resolution at the expense of spectral resolution and
often introduces some fairly audible distortion components (flanging effects)
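
The trade-off in the bullets above can be put into numbers. The sketch below assumes a 44100 Hz sample rate and an FFT size equal to the window size; the function name is made up for this example.

```python
def window_resolution(window_size, sample_rate=44100):
    """Return (length of one window in ms, spacing between FFT bins in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz

for size in (512, 1024, 2048, 4096, 8192):
    ms, hz = window_resolution(size)
    print(f"window {size:5d}: {ms:6.1f} ms long, bins {hz:5.1f} Hz apart")
```

Doubling the window size halves the spacing between bins (better frequency detail) but doubles the stretch of time that each frame smears together.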

FUNDAMENTAL FREQUENCY:

This is an estimate of the lowest frequency in your sound file.
Low frequencies are hard to find with Fourier analysis (a big window is required).
Here are the window sizes needed to represent certain F0s:

window size 8192    fund. freq. 26.9 Hz
window size 4096    fund. freq. 53.8 Hz
window size 2048    fund. freq. 107 Hz
window size 1024    fund. freq. 215 Hz
window size 512     fund. freq. 430 Hz
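
The numbers in this table are consistent with a window that spans about five periods of the fundamental at a 44100 Hz sample rate. That relation is inferred from the values themselves, not taken from the AudioSculpt documentation, so treat both constants below as assumptions.

```python
SAMPLE_RATE = 44100      # assumed; the table does not state a sample rate
PERIODS_PER_WINDOW = 5   # inferred from the table values, not documented here

def lowest_fundamental(window_size, sample_rate=SAMPLE_RATE, periods=PERIODS_PER_WINDOW):
    """Lowest fundamental (Hz) whose period fits `periods` times in the window."""
    return periods * sample_rate / window_size

for size in (8192, 4096, 2048, 1024, 512):
    print(f"window size {size:5d}  fund. freq. {lowest_fundamental(size):6.1f} Hz")
```

At other sample rates the same window sizes would correspond to different fundamentals.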

WINDOW TYPE:

The window type is a bell-shaped amplitude envelope applied to each frame to smooth out both of its ends -- the various window types use different functions to create this smoothing curve. It is a less important choice than the window size.

• Default Window type is Blackman and works for most sounds.
• If the sound has spectral components that are close in frequency, choose a Hanning or Hamming window.
• If the sound has only a few distinct spectral components, try the Blackman window.
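
One way to see why Hanning and Hamming suit closely spaced components is to compare main-lobe widths: a narrower main lobe separates neighbouring frequencies better. The sketch below measures this with NumPy's built-in window functions; the helper name, the 256-sample window and the oversampling factor are choices made for this example only.

```python
import numpy as np

def main_lobe_width_bins(window, oversampling=64):
    """Estimate the main-lobe width of a window, in bins of an unpadded FFT."""
    n = len(window)
    # Zero-pad heavily so the spectrum of the window itself is finely sampled.
    magnitude = np.abs(np.fft.rfft(window, n * oversampling))
    # Walk away from 0 Hz until the magnitude stops falling: the first null.
    k = 0
    while magnitude[k + 1] < magnitude[k]:
        k += 1
    # The main lobe is symmetric around 0 Hz, hence the factor of 2.
    return 2.0 * k / oversampling

for name, make in (("Hanning", np.hanning), ("Hamming", np.hamming), ("Blackman", np.blackman)):
    print(f"{name:8s} main lobe is about {main_lobe_width_bins(make(256)):.1f} bins wide")
```

The Blackman window comes out wider, which is the price it pays for lower side lobes, so it is the better fit when there are only a few well-separated components.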

FFT SIZE:

"FFT size in AudioSculpt is always forced to be a power of 2. For most analysis
types the selection of a proper FFT size is less important with respect to the achieved analysis properties
than the selection of a proper window size. For some algorithms, however, especially for peak – and partial
analysis and for transient detection an FFT size that is larger than the strict minimum (the power of 2 larger
than the size of the window) often significantly improves the analysis results. "

"Select the FFT size in relation to the window size using the desired spectral oversampling
as the user selected feature. ...a change in the window size will automatically adapt the FFT size
keeping the oversampling nearly constant, such that the FFT approximately stays in the same relation to
the window size as initially selected."
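
The relation between window size, oversampling and FFT size can be sketched as follows. This is an assumption about the behaviour described in the quotes, not AudioSculpt's actual code; the function name is hypothetical, and "strict minimum" is read here as the smallest power of 2 that is at least the window size.

```python
def fft_size_for(window_size, oversampling=1):
    """Smallest power of 2 that is at least window_size * oversampling."""
    target = max(1, int(window_size * oversampling))
    size = 1
    while size < target:
        size *= 2
    return size

# Changing the window size while keeping the requested oversampling at 4:
for window_size in (500, 1000, 2000):
    fft_size = fft_size_for(window_size, oversampling=4)
    print(window_size, fft_size, fft_size / window_size)
```

Because the FFT size must stay a power of 2, the effective ratio between FFT size and window size only stays "nearly" constant as the window size changes.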

WINDOW STEP:

The windows are overlapped to gain a better spectral picture through time. The window step is the number
of samples that AudioSculpt moves forward in time before creating a new window. By default, AudioSculpt moves 1/8
of the current window size. This parameter is adjustable for the elite user.

• Using a smaller step size can be useful when working with time-stretching algorithms. The closer the windows are to one another, the longer the sound file can be successfully stretched without falling into the cracks at either end of the smoothing window.
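
To make the step size concrete, here is a small sketch that lists where each analysis window would start. The function name is made up, and the 1/8 default simply mirrors the setting described above.

```python
def frame_starts(num_samples, window_size=1024, step=None):
    """Sample positions at which successive analysis windows begin."""
    if step is None:
        step = window_size // 8   # the 1/8-of-window default described above
    return list(range(0, num_samples - window_size + 1, step))

default_frames = frame_starts(2048)          # step of 128 samples
denser_frames = frame_starts(2048, step=64)  # half the step, about twice the frames
print(len(default_frames), len(denser_frames))
```

Halving the step roughly doubles the number of frames, which means more analysis data and longer processing in exchange for a denser picture through time.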

_____
AUDIOSCULPT and SDIF FORMAT

CNMAT pioneered the SDIF format with IRCAM, so many MMJ-Depot patches and CNMAT software tools can work with this data structure.

Study the CNMAT spectral tutorials found in the MMJ-Depot to learn the basics of SDIF.
Visit the home of SDIF on the web to get into the gritty details: http://sdif.sourceforge.net/

AudioSculpt comes with a special utility for extracting data from SDIF files and putting it into a text file. This is the easiest way to access analysis results and import them into other programs, including Max.

------------------------------------------
Analysis Support:

FUNDAMENTAL ANALYSIS:

The "Fundamental Analysis. . . " item in the "Analysis" menu opens the settings panel for
"Fundamental Analysis Parameters".

• "Fundamental Minimal Frequency" in Hz : this is the minimum fundamental frequency
threshold below which a fundamental frequency will not be looked for. The default is 50
Hz.
• "Fundamental Maximal Frequency" in Hz : this is the maximum fundamental frequency
threshold above which a fundamental frequency will not be looked for. The default is 1000
Hz.
• "Maximal Frequency in Spectrum" in Hz : the analysis will not take into account spectral
peaks above this threshold. The default is 4000 Hz.
• "Noise Threshold" in dB : this specifies a noise level: if the amplitude difference between
a given peak and the highest peak is greater than this value, the peak in question will not
be taken into account.
• "Smooth Order" : frequency smoothing filter order. The default is 3.
• in addition, the selection may be kept within certain limits, or another selection may be
defined. Just check the "Restrict to Selection" box, and if applicable, type in new values in
the appropriate fields.
• "Channel to analyse" : this allows you to choose either a single channel, or all channels
(for sounds that are not mono).
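
As a rough illustration of how the "Maximal Frequency in Spectrum" and "Noise Threshold" settings narrow down the peaks that are considered, here is a hedged sketch. The class, the field names and the function are hypothetical and are not part of AudioSculpt; only the default values come from the list above. No default is given for the noise threshold, so it must be supplied, and the sketch assumes that "the highest peak" means the loudest peak below the maximal frequency.

```python
from dataclasses import dataclass

@dataclass
class FundamentalParams:
    noise_threshold_db: float        # no default is stated in the text
    min_f0_hz: float = 50.0          # "Fundamental Minimal Frequency"
    max_f0_hz: float = 1000.0        # "Fundamental Maximal Frequency"
    max_spectrum_hz: float = 4000.0  # "Maximal Frequency in Spectrum"
    smooth_order: int = 3            # "Smooth Order"

def usable_peaks(peaks, params):
    """Filter (frequency_hz, amplitude_db) peaks from one analysis frame."""
    # Ignore every peak above the maximal frequency in the spectrum.
    in_range = [(f, a) for f, a in peaks if f <= params.max_spectrum_hz]
    if not in_range:
        return []
    # Drop peaks that lie too far below the loudest remaining peak.
    loudest = max(a for _, a in in_range)
    return [(f, a) for f, a in in_range if loudest - a <= params.noise_threshold_db]

params = FundamentalParams(noise_threshold_db=40.0)
peaks = [(220.0, -10.0), (440.0, -20.0), (880.0, -60.0), (5000.0, -5.0)]
kept = usable_peaks(peaks, params)
```

Here the 5000 Hz peak is ignored for being above 4000 Hz, and the 880 Hz peak is dropped because it sits 50 dB below the loudest remaining peak.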