Web-based perception experiments

From Phonlab

Latest revision as of 12:38, 29 July 2022

Experiments can be hosted at the Open Computing Facility (https://www.ocf.berkeley.edu/) or on the Department of Linguistics server, and then distributed to listeners by email link or via Amazon Mechanical Turk.

An example experiment is here: http://linguistics.berkeley.edu/~kjohnson/talkers/unique.php?list=test

Get a zipfile of everything you would need to run a web-based experiment in this format: zip of example experiment (http://linguistics.berkeley.edu/~kjohnson/talkers/example_experiment.zip). The "readme.txt" file in this archive details the different components used in the experiment. Some of this is also documented here.


A JavaScript library, audexp.js (http://linguistics.berkeley.edu/~kjohnson/talkers/js/audexp.js), makes it relatively easy to implement the following four typical kinds of experiments:

  • Identification (id) - a single audio file is played, and a two-alternative forced choice (2AFC) is given.
  • Discrimination (ax) - two audio files are played, and a 2AFC is given.
  • Rating (r) - a single audio file is played, and a rating number (from 1 to 7) is given.
  • Contrast Rating (cr) - two audio files are played, and a rating number is given.

Your html code must have four features:

1) after loading audexp.js, load a javascript file that defines:

  • an array (or two arrays) of filenames that will be presented.
  • a variable called 'block'
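    • here's an example of such a file: http://linguistics.berkeley.edu/~kjohnson/talkers/js/blocktest_list.js
    • this .js file was created from a .csv spreadsheet with a small perl script: http://linguistics.berkeley.edu/~kjohnson/talkers/js/make_wordlist.prl
    • the html header contains the following two lines, which load the audexp.js library and the experiment-specific list of sound files.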
 <script src="js/audexp.js"></script>
 <script src="js/blocktest_list.js"></script>
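
As a rough illustration of how a list file could be produced from a spreadsheet, the JavaScript sketch below turns CSV text into the source text of a list file. It is not the perl script mentioned above: the function name makeListFile and the array name `files` are assumptions, so check the linked example list file for the variable names that audexp.js actually expects.

```javascript
// Hypothetical helper: build the source text of a list file from CSV text.
// Assumes one sound file name in the first column of each row, no header row.
function makeListFile(csvText, block) {
  const names = csvText
    .split(/\r?\n/)                         // one row per line
    .map((row) => row.split(",")[0].trim()) // keep the first column
    .filter((name) => name.length > 0);     // skip blank lines
  // JSON.stringify yields valid JavaScript string and array literals.
  return (
    "var block = " + JSON.stringify(block) + ";\n" +
    "var files = " + JSON.stringify(names) + ";\n" // `files` is an assumed name
  );
}

const listSource = makeListFile("bead.wav,1\nbid.wav,2\n", "test");
console.log(listSource);
```

The printed text would be saved as js/blocktest_list.js (or another name built from the block) and loaded after audexp.js.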

In the experiment linked above, the link is to a php script which dynamically constructs the html code that is delivered to the user's web browser. The word 'test' is passed to the php script in the URL: http://linguistics.berkeley.edu/~kjohnson/rating_exp.php?list=test

The list variable is read from the php $_GET variable:

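<?php
 // Return the 'list' URL parameter if it is 1 to $maxlen characters drawn
 // from letters, digits, '.', '_' and '-'; otherwise throw an exception.
 function safe_get_list_param($maxlen) {
   // The hyphen is placed last in the character class so that it is literal.
   if (isset($_GET['list']) && preg_match("/^[a-zA-Z0-9._-]{1,$maxlen}$/", $_GET['list'])) {
     return $_GET['list'];
   }
   throw new Exception("didn't find a valid 'list' parameter.");
 }
 $block = safe_get_list_param(20);
?>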
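
The character class in a whitelist pattern like this one deserves care: a hyphen between two characters inside [...] defines a range, so it should be placed last (or escaped) when it is meant literally. The JavaScript sketch below, whose pattern names are purely illustrative, compares the two spellings; the same character-class rule applies to the patterns used by PHP's preg_match.

```javascript
// Inside [...], "\.-_" is read as a range from "." to "_", which admits
// characters such as "/", ":", "<" and ">" and excludes a literal "-".
const rangePattern = /^[a-zA-Z0-9\.-_]{1,20}$/;
// With the hyphen last, ".", "_" and "-" are all literal characters.
const literalPattern = /^[a-zA-Z0-9._-]{1,20}$/;

for (const value of ["test", "list-2", "a/b", "<x>"]) {
  console.log(value, rangePattern.test(value), literalPattern.test(value));
}
```

Only the second pattern accepts "list-2" and rejects "a/b" and "<x>", which matters here because the value ends up inside a file name.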


In the php script, the javascript lines above are generated with an embedded bit of php code (as below). This makes it possible for me to deploy different experiments (that just differ in the list of sound files that will be played), using the same php script. The script constructs different list file names on the fly from the "list" variable that is given in the URL.

<script src="js/audexp.js"></script>
<script src="js/block<?php echo $block; ?>_list.js"></script>
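
The mapping from the "list" parameter to a list file name can be sketched in JavaScript as follows. This only mirrors the idea of the php code for illustration; the function name listFileFor is hypothetical and nothing like it is part of audexp.js.

```javascript
// Hypothetical mirror of the php logic: read ?list=... from a URL and
// build the name of the list file that the generated page would load.
function listFileFor(pageUrl) {
  const block = new URL(pageUrl).searchParams.get("list");
  // Whitelist check: letters, digits, ".", "_" and "-", 1 to 20 characters.
  if (block === null || !/^[a-zA-Z0-9._-]{1,20}$/.test(block)) {
    throw new Error("didn't find a valid 'list' parameter.");
  }
  return "js/block" + block + "_list.js";
}

console.log(
  listFileFor("http://linguistics.berkeley.edu/~kjohnson/rating_exp.php?list=test")
);
// prints "js/blocktest_list.js"
```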

By the way, here is another small perl script that can be used to create mp3 copies of all of the sound files in a directory tree: make_mp3s (http://linguistics.berkeley.edu/~kjohnson/talkers/make_mp3s). Internet Explorer requires MP3 format audio (at the time of this writing).

2) a call to load the experiment when the page is loaded

  • this call specifies the type of experiment (in the call below this is 'id')
  • whether to randomize the order of presentation of the list of sound files
  • the intertrial interval for all experiments
  • the interstimulus interval (ISI) for 'ax' and 'cr' types. For very precise control of ISI, it is recommended that you create sound files that contain both of your stimuli separated by a silence of your chosen duration.

Some users experience delays as long as 2 or 3 seconds while a file loads from our server to their desktop. This 'load time' is reported in the data file, and the reaction time measurement does not include the load time.


 <body onload="load('id',false,2000,500);">
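
For readers curious about what randomizing the presentation order involves, here is a minimal JavaScript sketch of a Fisher-Yates shuffle. It is not taken from audexp.js, whose own randomization may differ; the function name shuffled and its `rng` parameter are assumptions made for this illustration.

```javascript
// Fisher-Yates shuffle; returns a new array and leaves the input untouched.
// `rng` defaults to Math.random and is a parameter only to allow testing.
function shuffled(items, rng = Math.random) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1)); // random index in 0..i
    [out[i], out[j]] = [out[j], out[i]];   // swap positions i and j
  }
  return out;
}

const trialOrder = shuffled(["a.wav", "b.wav", "c.wav", "d.wav"]);
console.log(trialOrder);
```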

3) Three or four span elements that will be used to give feedback to listeners.

  • Show warnings to the listener
<span id="wr"></span>
  • indicate that an audio file is playing
<span id="f1">sound 1</span>
  • indicate that a second audio file is playing (for ax, and cr type experiments)
<span id="f2">sound 2</span>
  • give feedback to the listener, showing which key they pressed.
<span id="key">#</span>
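
The way a script might write into these span elements can be sketched as below. This is an assumption about usage, not code from audexp.js; showFeedback is a hypothetical helper and the message strings are invented.

```javascript
// Hypothetical helper; `doc` is the page's document object (or a stand-in).
function showFeedback(doc, id, text) {
  const span = doc.getElementById(id); // one of "wr", "f1", "f2", "key"
  if (span === null) {
    throw new Error("missing feedback span: " + id);
  }
  span.textContent = text;
}

// In a browser this could be called as, for example:
//   showFeedback(document, "key", "1");
//   showFeedback(document, "wr", "Please wait for the sound to finish.");
```

The check for a missing span shows why the ids in the page have to match the ids that the script looks up.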

4) A <form ...> element named "dataform"

  • the names of these input items are strict: they must match the names expected in process.php.
<form method="POST" id="dataform" action="process.php?n=unique">
     <input type="hidden" name="subject" value="<?php echo $subj; ?>" />
     <input type="hidden" name="trial" />
     <input type="hidden" name="list" />
     <input type="hidden" name="file1" />
     <input type="hidden" name="filedur" />
     <input type="hidden" name="mystatus" />
     <input type="hidden" name="loadtime" />
     <input type="hidden" name="response" />
     <input type="hidden" name="rt" />
</form>
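
To make the naming requirement concrete, the JavaScript sketch below checks a record against the input names in the form above and shows the URL-encoded body that a POST of such a form carries. The helper missingFields and all of the example values are made up for illustration.

```javascript
// Input names expected by process.php, copied from the form above.
const EXPECTED_FIELDS = ["subject", "trial", "list", "file1", "filedur",
                         "mystatus", "loadtime", "response", "rt"];

// List the expected names that a record lacks (empty array: all present).
function missingFields(record) {
  return EXPECTED_FIELDS.filter((name) => !(name in record));
}

// Made-up values for one trial; units and formats are assumptions.
const exampleTrial = {
  subject: "test62e3d5a8a1b2c", trial: "1", list: "test",
  file1: "bead.wav", filedur: "450", mystatus: "ok",
  loadtime: "120", response: "1", rt: "830",
};

console.log(missingFields(exampleTrial)); // prints []
// URL-encoded body of the kind a browser sends for a POSTed form.
const postBody = new URLSearchParams(exampleTrial).toString();
console.log(postBody);
```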

The experiment php script linked at the top of this page generated a unique subject number for the user with a little php code:

<?php
   $block = safe_get_list_param(20);
   $subj = uniqid($block);
?>

This subject number will start with the word or number that you passed to the script as 'list' in the url (http://linguistics.berkeley.edu/~kjohnson/rate_exp.php?list=test).
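
The shape of such an identifier can be imitated in JavaScript as a rough sketch. PHP's uniqid() appends 13 hexadecimal digits derived from the current time to the prefix; the function uniqidLike below is a hypothetical imitation and will not reproduce the exact values PHP generates.

```javascript
// Rough imitation of PHP's uniqid($prefix): the prefix followed by 13 hex
// digits derived from a timestamp (8 for seconds, 5 for microseconds).
function uniqidLike(prefix, nowMs = Date.now()) {
  const sec = Math.floor(nowMs / 1000);
  const usec = (nowMs % 1000) * 1000; // Date.now() has millisecond resolution
  return (
    prefix +
    sec.toString(16).padStart(8, "0") +
    usec.toString(16).padStart(5, "0")
  );
}

const subjectId = uniqidLike("test");
console.log(subjectId); // starts with "test", followed by 13 hex digits
```

Because the prefix comes first, data from different lists can be told apart by the start of the subject identifier.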


Saving data to a server file: process.php

Finally, in (4) above you may have noticed there was reference to a file: process.php. There is probably no reason to modify this file, but anyway here it is, with some discussion of what it does.

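<?php
// Return the POSTed value named $p if it is 1 to $maxlen characters drawn from
// letters, digits, space, '.', '_' and '-'; otherwise return '<invalid>'.
function safe_post_param($p, $maxlen) {
 // The hyphen is placed last in the character class so that it is literal.
 if (isset($_POST[$p]) && preg_match("/^[a-zA-Z0-9 ._-]{1,$maxlen}$/", $_POST[$p])) {
   $val = $_POST[$p]; 
 } else { $val = '<invalid>'; }
 return $val;
}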
  • this function checks that data coming in is safe to store
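// Build the include file name from the ?n=... parameter (word characters only).
if (isset($_GET['n']) && preg_match("/^\w{1,20}$/", $_GET['n'])) {
 $n = $_GET['n'];
 $incfile = 'ep_' . $n . '.inc';
 // The include file sets $datafile and $type.
 $success = include_once($incfile);
 if (! $success) {
    throw new Exception("didn't open the include file.");
 }
} else {
 // An uncaught exception ends the script, so no exit() is needed after it.
 throw new Exception("didn't find GET parameter.");
}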
  • This portion of process.php constructs a file name from the parameter that you pass to it. Given the parameter ?n=xxx, the script builds the file name "ep_xxx.inc" and includes that file, which defines a variable called $datafile holding the name of the file where the experiment data will be saved.
    • for example, if ep_xxx.inc contains $datafile="id_data1.csv" then your data will be stored in that .csv file, one line per button press response.
    • the include file also has a line such as $type='r'; the $type variable names the experimental paradigm ('id', 'ax', 'r' or 'cr') and is used to decide whether to store the name of a second sound file, which is done for the paradigms that play two sound files with an interstimulus interval ('ax' and 'cr').


$form_params['subject'] = 20;
$form_params['trial'] = 5;
$form_params['list'] = 5;
$form_params['file1'] = 20;
if ($type=='ax' || $type =='cr') {$form_params['file2'] = 20; }
$form_params['filedur'] = 10;
$form_params['loadtime'] = 10;
$form_params['mystatus'] = 20;
$form_params['response'] = 10;
$form_params['rt'] = 10;
  • The <input..> elements of "dataform" (# 4 above) are now saved as columns in the data file.
    • process.php checks the names of the input elements in the form, and only accepts data from these form elements. The number assigned to each name is the maximum number of characters accepted for that input item; a longer value is recorded as '<invalid>'.
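if ($_SERVER["REQUEST_METHOD"] == "POST") {
 // Collect the whitelisted values in the order defined by $form_params.
 $formdata = [];
 foreach($form_params as $p=>$maxlen) {
     array_push($formdata, safe_post_param($p, $maxlen));
 }
 // One comma-separated line per response, appended under an exclusive lock.
 $data = join(",", $formdata) . "\n";
 $ret = file_put_contents($datafile, $data, FILE_APPEND | LOCK_EX);
 // file_put_contents() returns false on failure.
 if ($ret === false) {
    throw new Exception("error on file_put_contents()");
 }
}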
?>
  • The data file must already exist on the server
    • This script adds lines, but does not create a new file
    • The data file must have access privileges that let the php script write to the file (see Ronald Sprouse about how to set these privileges).
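
The validation and row-building steps of process.php can be restated in JavaScript as a sketch for readers who find that easier to follow. It is not a replacement for the php script: the names FORM_PARAMS, safeParam and csvRow are hypothetical, the column list is the one used for paradigms without a second sound file, and the character class is written with the hyphen last so that it is matched literally.

```javascript
// Whitelist of column names and maximum lengths, as in $form_params.
const FORM_PARAMS = {
  subject: 20, trial: 5, list: 5, file1: 20, filedur: 10,
  loadtime: 10, mystatus: 20, response: 10, rt: 10,
};

// Return the value if it is a short string of allowed characters,
// otherwise the placeholder "<invalid>".
function safeParam(posted, name, maxlen) {
  const value = posted[name];
  const pattern = new RegExp("^[a-zA-Z0-9 ._-]{1," + maxlen + "}$");
  return typeof value === "string" && pattern.test(value) ? value : "<invalid>";
}

// Build one line of the data file; fields not in the whitelist are ignored.
function csvRow(posted) {
  return (
    Object.keys(FORM_PARAMS)
      .map((name) => safeParam(posted, name, FORM_PARAMS[name]))
      .join(",") + "\n"
  );
}

console.log(csvRow({ subject: "test123", trial: "1", list: "test",
  file1: "bead.wav", filedur: "450", loadtime: "120", mystatus: "ok",
  response: "1", rt: "830", extra: "ignored" }));
```

Since commas are not in the allowed character set, a stored value can never split a row into extra columns.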

Reading your data into R

  • You can read the data file directly into R from your server address:
data <- read.csv("https://linguistics.berkeley.edu/~kjohnson/talkers/id_data.csv",header=TRUE);
  • verify_turk_rating_data.R (https://linguistics.berkeley.edu/~kjohnson/talkers/verify_turk_rating_data.R) shows an example of a script that reads rating data and judges whether the participant did a good job. This type of quick look at the data is useful in the process of approving work on MTurk.
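
A first look at the data can also be sketched in JavaScript. The example below is not the R script's method; it only parses CSV text that begins with a header line and counts rows per subject. The function names and the sample data are invented, and the column name `subject` is assumed to appear in the header.

```javascript
// Parse CSV text whose first line is a header into an array of row objects.
// A plain split on "," is enough here because process.php never stores
// values that contain commas.
function parseCsv(text) {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",");
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    return Object.fromEntries(header.map((name, i) => [name, cells[i]]));
  });
}

// Count how many response rows each subject contributed.
function trialsPerSubject(rows) {
  const counts = {};
  for (const row of rows) {
    counts[row.subject] = (counts[row.subject] || 0) + 1;
  }
  return counts;
}

const sampleData = "subject,trial,rt\ns1,1,830\ns1,2,640\ns2,1,<invalid>\n";
console.log(trialsPerSubject(parseCsv(sampleData)));
```

A subject with far fewer rows than the number of trials, or with many '<invalid>' entries, would be a natural candidate for a closer look.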