Difference between revisions of "Guide to lab computing"
Revision as of 15:53, 11 April 2022
Reproducible Python environments
If you work with Python (or any programming language) over an extended period, you will find that your old projects no longer work in the same environment as your newer projects. The language itself evolves over time, as do the library dependencies you import into your projects. As you keep up with the latest changes, your older scripts tend to break.
The solution to this problem is to create independent environments for your projects (see https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html). Doing so helps to ensure the long-term reproducibility of your code, and it also makes it easier to collaborate with other researchers. Keep your old environment definition around for your old project, and use a new environment with updated packages for your new project.
Here's how to get started with environments in Anaconda Python:
- Install Miniconda instead of using the full Anaconda installer. This gives you conda and a minimal base environment (including the Python that conda itself runs on) and little else.
- (Optional and recommended) Make conda-forge your default package channel. This channel is a community-created source of many useful packages that tends to be a little more comprehensive and up-to-date than Anaconda's default package list.
  - Set conda-forge as the highest priority channel: conda config --add channels conda-forge
  - Activate strict channel priority: conda config --set channel_priority strict
- Create an environment for your project. This is where you install Python, Jupyter, and whatever additional packages you need. It's best to keep the base environment minimal and do all of your work in a non-base environment. You might find that you can install everything you need cleanly in a single environment, but it's easy to create additional environments if you need to. Separate environments help you avoid package conflicts, and they make the workflow for a single project easy to repeat and share.
Whenever possible, install packages using conda. If you need something that is not available in the default channel, the conda-forge channel is a good source of additional (and more up-to-date) scientific software. Fall back to pip only in the event you can't find a conda package. As a last resort you can download and install some software by hand, e.g. audiolabel, which has no alternative installation method.
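As a rough illustration of the environment step above, the following Python sketch builds the conda command lines for creating an environment and for exporting its definition so that it can be kept with the project. The environment name, the package list, and the helper function `conda_env_commands` are placeholders invented for this example; the commands are only printed, not run.

```python
import shlex


def conda_env_commands(env_name, packages, channel="conda-forge"):
    """Build conda command lines that create an environment and record its definition.

    env_name and packages are placeholders chosen by the caller.
    """
    create = ["conda", "create", "--name", env_name,
              "--channel", channel, "--yes", *packages]
    # Exporting the definition lets you rebuild the environment later
    # with `conda env create --file <name>.yml`.
    export = ["conda", "env", "export", "--name", env_name,
              "--file", f"{env_name}.yml"]
    return [create, export]


# Hypothetical project environment with pinned Python and a few packages.
commands = conda_env_commands("myproject", ["python=3.10", "jupyter", "numpy"])
for cmd in commands:
    # Print instead of running, so the commands can be reviewed first.
    print(shlex.join(cmd))
```

To actually run the commands, each list could be passed to `subprocess.run(cmd, check=True)`, or the printed lines could be pasted into a terminal, followed by `conda activate myproject`.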
Printing
The lab printer is a Xerox Phaser 3250, located in room 50.
For troubleshooting see the printer manual.
The Berkeley Phonetics Machine
The Berkeley Phonetics Machine is a virtual machine with phonetic software preinstalled.
Sample scripts and snippets
- get_dur -- a very simple script for reading label durations from a Praat textgrid
- Python notebooks for reading Praat textgrids and performing formant analysis on vowel tokens
- output formatting in Python -- a Python snippet for creating readable and maintainable output format and header strings in your scripts
- sox cookbook for phonetics -- not exactly a script; a page describing scriptable ways to use sox that are useful for phoneticians
- ffmpeg reference -- a reference page describing scriptable ways to use ffmpeg for creating video stimuli
- simplerec.osexp -- a simple audio recording experiment for OpenSesame
- multi_align -- a script for running pyalign on an audio file based on labelled regions of a textgrid. Pull it into the BPM with 'sudo bpm-update ucblingmisc'.
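The idea behind the output formatting item in the list above can be sketched as follows. This is not the linked snippet itself, only a minimal illustration under assumed data: the label tuples and column names are invented, and the labels are assumed to have been read from a textgrid already. The header and the row format are both generated from one column definition, so adding or reordering a column means changing a single list.

```python
# Hypothetical label data: (label text, start time, end time) in seconds.
labels = [("sil", 0.000, 0.250), ("k", 0.250, 0.310), ("ae", 0.310, 0.472)]

# One (column name, format spec) pair per output column, so the header
# and the row format are generated from a single definition.
columns = [("label", "s"), ("t1", ".4f"), ("t2", ".4f"), ("dur", ".4f")]

header = "\t".join(name for name, _ in columns)
row_format = "\t".join(f"{{{name}:{spec}}}" for name, spec in columns)

# Duration is computed as end time minus start time for each label.
rows = [
    row_format.format(label=text, t1=t1, t2=t2, dur=t2 - t1)
    for text, t1, t2 in labels
]

print(header)
print("\n".join(rows))
```

Using named fields in the format string keeps each value tied to its column name, which tends to be easier to maintain than positional formatting as the number of columns grows.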
Tools and libraries
The tools and libraries listed here are available on the department server. Some may also be available for other platforms.
Local tools
- ifcformant -- a command line tool for extracting formant measurements, as described in Ueda, Yuichi; Hamakawa, Tomoya; Sakata, Tadashi; Hario, Syota; Watanabe, Akira (2007) A real-time formant tracker based on the inverse filter control method, Acoustical Science and Technology (Acoustical Society of Japan) 28(4), 271-4. We are grateful to Yuichi Ueda for providing the C code which implements the algorithm. The user interface is provided by a Python wrapper around the authors' C code and was written by Ronald Sprouse.
Lab members can contact Ronald Sprouse for copies of ifcformant compiled for OS X, Windows, or Linux systems. Unfortunately we do not have permission to distribute the C code or compiled versions of this tool to the public.
For detailed usage information, run: ifcformant --help
- convertlabel -- a command line tool for converting between Praat textgrids, ESPS label files, and Wavesurfer label files. You can also scale or shift timepoints in the label file by a specified amount. Written by Ronald Sprouse.
For detailed usage information, run: convertlabel --help
- concat_pyalign_textgrids -- a command line tool for concatenating Praat TextGrids. Written by Keith Johnson.
- Klatt synthesizer -- a speech synthesizer originally written by Dennis Klatt.
- ultracomm -- a command line tool for configuring and acquiring ultrasound data from an Ultrasonix Tablet system.
- ultrasession.py -- a Python script for running ultracomm and simultaneously acquiring audio and ultrasound synchronization signals.
Local libraries
- audiolabel -- a Python library for reading and writing Praat textgrids, ESPS label files, Wavesurfer label files, and time-aligned tabular data. Special access methods for retrieving labels at specified times or by matching label content. Written by Ronald Sprouse, and available on GitHub. See meas_formants for a sample script that uses this library. The audiolabel_demo walks you through many of the steps executed in meas_formants.
- SoundLabel.pm -- a Perl library for reading and writing Praat textgrids, ESPS label files, and Wavesurfer label files, written by Ronald Sprouse. Old and clunky API. You are encouraged to write scripts that use audiolabel instead.
Handy third-party tools
- pyalign -- a command line tool for automatically aligning phones to an audio file based on an orthographic transcription of the audio.
- ffmpeg -- a command line tool for transcoding video and audio. See the ffmpeg reference page for tips on how to use it.
- reaper -- a command line tool for calculating F0 from an audio file
- sox -- 'the Swiss Army knife of sound processing programs'; a command line tool for audio processing. See the sox in phonetic research page for sample usages.
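Command line tools such as sox and ffmpeg are often driven from a Python script when many files need the same treatment. The sketch below is one possible way to do that, not taken from the reference pages mentioned above: the helper functions and file names are invented for illustration, and the commands are only run if the tool is found on the PATH.

```python
import shutil
import subprocess


def sox_resample_cmd(src, dst, rate=16000):
    """Command line asking sox to write a mono copy of src at the given sample rate."""
    return ["sox", src, "-r", str(rate), "-c", "1", dst]


def ffmpeg_extract_audio_cmd(src, dst):
    """Command line asking ffmpeg to drop the video stream (-vn) and write the audio to dst."""
    return ["ffmpeg", "-i", src, "-vn", dst]


def run_if_available(cmd):
    """Run cmd only when its executable is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        print(f"{cmd[0]} not found on PATH; skipping")
        return False
    subprocess.run(cmd, check=True)
    return True


# File names are placeholders; nothing is executed here, only printed.
resample = sox_resample_cmd("token.wav", "token_16k.wav")
extract = ffmpeg_extract_audio_cmd("session.mp4", "session.wav")
print(" ".join(resample))
print(" ".join(extract))
```

Passing the command as a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces, and `check=True` raises an error if the tool exits with a failure status.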