the:behavioral:lab

Custom Timers and Time’s Up Notifications in Qualtrics

I recently had to program a study in Qualtrics that put participants under time pressure. In addition to the instructions, I wanted a timer at the top of the page that stayed fixed in place as people scrolled through the questions (as opposed to disappearing when people started scrolling). I also wanted to run some JavaScript after time ended: to add further pressure, when time is up, I wanted the background to flash red and white. Qualtrics has a wonderful countdown clock feature in its timing question, but to do the custom stuff I wanted to do, I needed to create my own timer.

CSS

Normally, I would start with structure (HTML), but since the HTML is made via JavaScript, I will start with the formatting. One element will contain everything (I will give it the class header-cont, short for container). I make it take up 100% of the width of its container (in this case the whole page) and set its position to fixed. This is what keeps the element visible even when scrolling. Next, I put it 0 pixels from the top of the page, ensuring that nothing will appear above it. Last, I set the z-index to an arbitrarily high number so that elements of the Qualtrics survey don’t overlap it.

The CSS for the other elements is mostly visual (colors, font size, etc.). It’s somewhat easy to follow if you want to change things. The header class will be for another container element, and the timer class is for the element that will contain the timer text (this can be done with fewer elements, but with this structure the timer will be more portable to other projects). This CSS should go in the Custom CSS area within the Look & Feel menu under the Advanced Tab.

 .header-cont {  
   width:100%;  
   position:fixed;  
   top:0px;  
   z-index:1000;  
 }  
 .header {  
   height:75px;  
   background:#FFFFFF;  
   width:100%;  
   margin:0px auto;  
 }  
 .timer{  
   width: 500px;  
   margin: auto;  
   text-align: center;  
   vertical-align: middle;  
   line-height: 75px;       
   font-size: 24px;  
 }  

JavaScript considerations

The timer’s functionality, including the HTML elements that structure and visually present the time, is all created in JavaScript. There are a number of considerations. A major one is the theme you choose for your survey from the Look & Feel menu. As that menu name suggests, the themes don’t just affect the look of the survey. One element of the “feel” is how the survey transitions to different pages. In some themes (I’ll call them static themes), a new page of your survey requires loading a new web page. This means loading a fresh HTML document, and anything we did to change the HTML document (like adding a timer) will be gone. Other themes (dynamic themes) change survey pages by dynamically loading content without loading a new HTML document. This means that when you move to the next page, your timer will remain. This creates at least four possibilities:

  1. You use a static theme and want the timer only on one page.
  2. You use a dynamic theme and want the same timer to countdown across all pages after the timer starts, without starting over.
  3. You use a dynamic theme and want the timer only on one page.
  4. You use a static theme and want the same timer to countdown across all pages after the timer starts, without starting over.

Options 1 and 2 are the easiest, and require the same solution. Option 3 is slightly more difficult, but it is what most people will encounter (e.g. I usually use the Minimalist theme, which is dynamic; the colors, font size, etc. below work well with this theme). I cover that second, but I encourage you to read the solution for options 1 and 2 first, since I put more explanation there. Option 4 is the most complicated, because it is hard to tell the page that is loading what time the last page ended on, and therefore what time the timer on the current page should start on. I may get to it soon. If you are interested, email me (hint: the easiest solution would save the time to an embedded data field).
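To make the hint for option 4 a little more concrete, here is a rough sketch of the arithmetic only. It assumes you save a deadline (a timestamp in milliseconds) rather than the seconds left, so that page-loading time is not lost between pages. The helper names makeDeadline and remainingSeconds are made up for illustration, and the step of writing the deadline to an embedded data field and reading it back is deliberately left out.

```javascript
// Hypothetical helpers, not part of Qualtrics.

// Work out when the timer should hit zero, given a start time in milliseconds.
function makeDeadline(startMs, durationSeconds) {
  return startMs + durationSeconds * 1000;
}

// Work out how many whole seconds are left, never going below zero.
function remainingSeconds(deadlineMs, nowMs) {
  return Math.max(0, Math.ceil((deadlineMs - nowMs) / 1000));
}

// On the first timed page: compute the deadline once and save it somewhere
// that survives a page load (assumed here to be an embedded data field).
var deadline = makeDeadline(Date.now(), 30);

// On each later page: read the saved deadline back and start the timer
// from whatever is left instead of from the full duration.
var secondsLeft = remainingSeconds(deadline, Date.now());
```

The value in secondsLeft would then be passed to the timer in place of the fixed number of seconds.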

JavaScript – Options 1 and 2

To create the JavaScript, I first created all the questions on the page, and added a Timing question as well. It is not required that you put this JavaScript in a Timing question, but I wanted to know the total time people spent doing the task, and all the other questions had their own JavaScript cluttering up everything.

I separated the code into logical tasks, but it should all just be copied and pasted, in order, into the JavaScript editor in Qualtrics. This first part creates the HTML elements and adds them to the HTML document.

 var headerCont = document.createElement("div");  
 headerCont.className = "header-cont";  
 headerCont.id = "header_container";  
 var header = document.createElement("div");  
 header.className = "header";  
 header.id = "header_1";  
 var timer = document.createElement("div");  
 timer.className = "timer";  
 timer.id = "timer_1";  
 timer.innerHTML = "Time Remaining: <span id='time'>00:30</span>";  
 headerCont.appendChild(header);  
 header.appendChild(timer);  
 document.body.insertBefore(headerCont, document.body.firstChild);  

The first nine lines of code successively create each element and give it a class name and ID so it can be identified later. While, in this case, class names aren’t necessary since there is only one instance of each class, I use them anyway, as it’s how I would program it for larger projects. Within the timer element, I add raw HTML (which is just text and a single element). The span is how the JavaScript is going to know where to put the updated timing text. The final three lines put the header inside the header container, then the timer inside the header, and finally add the header container (which now contains the other elements) to the body of the HTML document so it can be seen by participants. The last line specifically adds the container as the first element in the body. This isn’t necessary here, and appendChild() should work equally well for this usage, but if you didn’t want fixed positioning and wanted the header at the top, this would be necessary.

The final step is programming a timer that updates the span I created previously, and executes a function when the timer hits 0.

 function startTimer(duration, display) {  
  var timer = duration, minutes, seconds;  
  var myTimer = setInterval(function() {  
   minutes = parseInt(timer / 60, 10);  
   seconds = parseInt(timer % 60, 10);  
   minutes = minutes < 10 ? "0" + minutes : minutes;  
   seconds = seconds < 10 ? "0" + seconds : seconds;  
   var text = ('innerText' in display)? 'innerText' : 'textContent';
   display[text] = minutes + ":" + seconds;  
   if (--timer < 0) {  
    clearInterval(myTimer);  
    timeOver();  
   }  
  }, 1000);  
 }  
 var timerSeconds = 30,  
 display = document.querySelector('#time');  
 startTimer(timerSeconds, display);  
 var timeOver = function() {  
  document.getElementById("timer_1").innerHTML = "Time is up.";  
  x = 1;  
  var bgColor = setInterval(change, 1000);  
 }  
 function change() {  
  if (x === 1) {  
   color = "red";  
   x = 2;  
  } else {  
   color = "white";  
   x = 1;  
  }  
  document.body.style.background = color;  
 }  

The startTimer() function takes two parameters. The first is how many seconds you’d like the timer to last. The second is the HTML element you’d like the time to appear in. Within the function, the myTimer variable gets assigned an interval (a function that executes at fixed time intervals indefinitely until you tell it to stop). The first 6 lines of the interval function parse the current time into a readable format (e.g. representing 4 seconds as 00:04 instead of :4) and update the text content of the timer span. The last part tests if the timer is over. If it is, it clears the interval (this keeps the countdown from running past zero and calling timeOver() again every second, and Qualtrics will also get mad at you if your JavaScript includes an interval but not a clearInterval() statement), and importantly executes a new function (timeOver()). Below that initial function, three lines set the number of seconds for the timer, identify the span that contains the timer text, and activate the timer. Finally, the timeOver() function changes the text of the timer, which now should read “Time Remaining: 00:00,” to “Time is up.” and creates a new interval. The interval function (named change) changes the background color of the body element of the HTML document to red if x is 1 and white otherwise, and also flips the value of x. The end result is the background flashing red and white to signal that participants need to hurry up and finish.
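The formatting step is easier to see on its own, outside the interval. The sketch below pulls it into a standalone function. The names pad2 and formatTime are made up for illustration and do not appear in the timer code above, and it uses Math.floor where the timer code uses parseInt, which gives the same result for the non-negative whole seconds used here.

```javascript
// Hypothetical helper: left-pad a number to two digits ("4" becomes "04").
function pad2(n) {
  return n < 10 ? "0" + n : String(n);
}

// Hypothetical helper: turn a number of seconds into MM:SS text.
function formatTime(totalSeconds) {
  var minutes = Math.floor(totalSeconds / 60); // whole minutes
  var seconds = totalSeconds % 60;             // leftover seconds
  return pad2(minutes) + ":" + pad2(seconds);
}

console.log(formatTime(4));   // 00:04
console.log(formatTime(125)); // 02:05
```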

JavaScript – Option 3

If you are using a theme where you need to get rid of the timer manually, there are some complications. The main issue is that when the new content is loaded, the old content, including the JavaScript used to create the timer, is removed. However, intervals are still active, so their functions will keep running until you tell them to stop. Telling them to stop is difficult: since the old JavaScript is removed, you can no longer reference them by name. One hack would be to stop all active intervals. However, since I don’t know what intervals Qualtrics may use in the functionality of the surveys, I did not want to risk mucking up other aspects of the survey. Instead, my fix was to manually add the JavaScript for the timer inside the HTML elements I created. It sounds easy enough, but it requires all the code to be on a single line (which is very messy, if not difficult).

I will start with the code to clear everything, since it will help when explaining changes made to the earlier code. This code can go in any question on the page immediately following the page the timer is on. If there are multiple possible pages, the code can go on all of them. If, like my project, the landing page sometimes had a timer before it, and sometimes didn’t, this will still work without error.

 if (document.contains(document.getElementById("header_container"))) {  
  if (myTimer) clearInterval(myTimer);  
  if (bgColor) clearInterval(bgColor);  
  var el = document.getElementById("header_container");  
  el.parentNode.removeChild(el);  
  document.body.removeAttribute("style");  
 }  

The flow of the code is:

  1. Check if the timer exists. If there is no timer, there is no need to run the code.
  2. If it does exist, check if the myTimer interval is running. If so, clear it.
  3. Do the same for the bgColor interval.
  4. Find and remove the header container element (and thus everything else).
  5. Remove the style attribute from the body element (if the page was blinking red, it will revert to its default color).

Below is the JavaScript for the timer. It is similar to what’s above, with some changes, and I do not split it up by task. The main difference is where the JavaScript code for the timer goes. JavaScript runs as part of an HTML document. If JavaScript is running on a webpage, there is a script element somewhere that either contains the code or loads it from a file on a server. When you add JavaScript to a survey question in Qualtrics, it puts the script element in the header of the HTML document. It removes that element when a new survey page is loaded. I want to put some of the JavaScript somewhere else, so Qualtrics can’t remove it. Only I can. I do this by using JavaScript to create a script element within our other timer HTML elements. I then add text to the element. The text will be read and run as JavaScript code once the element is part of the HTML document.

 var headerCont = document.createElement("div");  
 headerCont.className = "header-cont";  
 headerCont.id = "header_container";  
 var header = document.createElement("div");  
 header.className = "header";  
 header.id = "header_1";  
 var timer = document.createElement("div");  
 timer.className = "timer";  
 timer.id = "timer_1";  
 timer.innerHTML = "Time Remaining: <span id='time'>00:30</span>";  
 var s = document.createElement('script');  
 s.type = 'text/javascript';  
 var code = 'var myTimer=false,bgColor=false;function startTimer(e,r){var t,n,i=e;myTimer=setInterval(function(){t=parseInt(i/60,10),n=parseInt(i%60,10),t=10>t?"0"+t:t,n=10>n?"0"+n:n;var e="innerText"in r?"innerText":"textContent";r[e]=t+":"+n,--i<0&&(clearInterval(myTimer),myTimer=!1,timeOver())},1e3)}function change(){1===x?(color="red",x=2):(color="white",x=1),document.body.style.background=color}var myTimer=!1,bgColor=!1,timerSeconds=30,display=document.querySelector("#time");startTimer(timerSeconds,display);var timeOver=function(){document.getElementById("timer_1").innerHTML="Time is up.",x=1,bgColor=setInterval(change,1e3)};';  
 headerCont.appendChild(header);  
 header.appendChild(timer);  
 try {  
  s.appendChild(document.createTextNode(code));  
  timer.appendChild(s);  
 } catch (e) {  
  s.text = code;  
  timer.appendChild(s);  
 }  
 document.body.insertBefore(headerCont, document.body.firstChild);  

The differences start on the line that begins with “var s = …” This line creates a script element. The next tells browsers that the element will contain JavaScript. The next contains all the modified JavaScript code for the timer. Since all the code has to be a single string, it has to be on a single line. To make this a little easier, I used a simple minify service. I’ll come back to minifying, as well as the modified timer code, in a second. The last change is the try…catch block. The first block of code, which adds the JavaScript text to the script element using createTextNode, should work on most browsers, but in case there is an error, the second block of code should work.

Back to the minified code. Minifying code removes all unnecessary white space, changes some variable names to single characters, and performs some light rewriting to make the code smaller. This matters for high-traffic sites like Google, where smaller files mean less to download and faster page loads. I used it mostly to get rid of white space (putting all the code on a single line) and to make it slightly less messy (e.g. the line of code would extend about 50 ft off to the right of your computer screen if I put in the longer, human-readable code). One word of caution about minifying, if you are going to alter the code and then re-minify: while I’m sure their methods are correct, the minifier removed the first line of code (initializing the variables that will store the intervals to false). Without this line, it will be harder to check whether the intervals are active later when clearing everything. Add that line back if needed.

  var myTimer = false, bgColor = false;  
  function startTimer(duration, display) {  
   var timer = duration,  
    minutes, seconds;  
   myTimer = setInterval(function() {  
    minutes = parseInt(timer / 60, 10);  
    seconds = parseInt(timer % 60, 10);  
    minutes = minutes < 10 ? "0" + minutes : minutes;  
    seconds = seconds < 10 ? "0" + seconds : seconds;  
    var text = ('innerText' in display) ? 'innerText' : 'textContent';  
    display[text] = minutes + ":" + seconds;  
    if (--timer < 0) {  
     clearInterval(myTimer);  
     myTimer = false;  
     timeOver();  
    }  
   }, 1000);  
  }  
  var timerSeconds = 30,  
   display = document.querySelector('#time');  
  startTimer(timerSeconds, display);  
  var timeOver = function() {  
   document.getElementById("timer_1").innerHTML = "Time is up.";  
   x = 1;  
   bgColor = setInterval(change, 1000);  
  }  
  function change() {  
   if (x === 1) {  
    color = "red";  
    x = 2;  
   } else {  
    color = "white";  
    x = 1;  
   }  
   document.body.style.background = color;  
  }  

The first line initializes two variables to false. These variables will later hold the intervals. Without this line, if either interval is never set (which is likely for the bgColor interval for some participants), an error will be thrown on the next page when clearing the timer. Next, when the timer ends, I reset the myTimer variable to false, since the interval is already cleared. That is about it for changes (mainly because I went back and added the other changes to the Option 1 and 2 code to make it look similar). If you need to edit this code, copy and paste it into your editor, make edits, re-minify, add the first line back, and paste it within the quotation marks on the proper line above.
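If re-minifying after every edit gets tedious, one possible alternative (not the approach used above) is to write the timer code as an ordinary function and let JavaScript turn it into a string. The sketch below is only an illustration: toInlineScript and timerSetup are made-up names, and the function body is a stand-in for the full timer code. Anything that a later page needs to reach, such as the interval variables, has to be attached to the global object, because variables declared inside the function stay private to it.

```javascript
// Hypothetical helper: turn a function into script text that runs itself.
function toInlineScript(fn) {
  return "(" + fn.toString() + ")();";
}

// Stand-in for the timer code, written on as many lines as you like.
function timerSetup() {
  // Use window in a browser; fall back to globalThis elsewhere.
  var root = typeof window !== "undefined" ? window : globalThis;
  // Attach the interval variables globally so a later page can clear them.
  root.myTimer = false;
  root.bgColor = false;
}

// This string plays the role of the minified code string in the script above.
var code = toInlineScript(timerSetup);
```

The resulting string would be added to the script element the same way as the minified one.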

A better way to ask for currency responses in Qualtrics

Willingness to Pay (WTP) questions are ubiquitous in marketing research (my field), as well as many others. When asking for WTP in Qualtrics, you normally have to settle for some sub-par workarounds. When asking in an open-answer text box, you can leave validation off, but you risk getting a lot of gibberish that isn’t quite missing data, but isn’t quite usable data either. If you add a numerical validator, participants may get annoyed at the inevitable error messages. They try to type something like $300, but to a computer, that’s not a number (it contains the dollar sign, so it gets interpreted as a character string). You could also use something like slider bars, but those set artificial anchors that could bias results.

I solve this problem with regular expressions. If you are not familiar with regular expressions, they are a way of describing a pattern within a string of characters. For instance, an email address is any number of letters, numbers, and certain special characters, followed by the @ symbol, followed by any number of strings of letters, numbers, and hyphens that are optionally separated by periods, followed by a period and a letter string of between 2 and 4 characters (it’s actually a little more complicated, but that covers 99% of email addresses). In regular expression coding, that is represented by the following: ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$

If you think it looks complicated, it’s because it is. After 7 years of programming, when I write regular expressions, I still use cheat sheets. Recently, I decided I was sick of using Qualtrics’ numeric validator and inserting the instruction “Only use numbers and optionally a decimal point. Do not use dollar signs or commas.” every time I asked for a response in currency format. Qualtrics allows you to create your own validations, and one option is to match the inputted text to a regular expression. With some writing, some tinkering, some forum searches, some more tinkering, and a lot of testing, I settled on the following expression to validate US currency (it is not difficult to change this to other formats).

^[+-]?\$?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$

That regular expression optionally allows the person to specify positive or negative values ([+-]?), optionally followed by a dollar sign (\$?). The next part is a little complicated. The point of it is to optionally allow a thousands separator (a comma in the US, typically). I defined that as 1 to 3 digits ([0-9]{1,3}), followed by a group made up of an optional comma and exactly 3 digits ( (?:,?[0-9]{3}) ). That group can repeat any number of times, including zero ( * ). Optionally, it allows decimals by requiring a period followed by exactly 2 digits ( (?:\.[0-9]{2})? ). The ^ at the beginning says that whatever is typed has to start there. The $ at the end says whatever is typed has to end there. Both together just mean that the entirety of what they type has to match what is between the two symbols. Otherwise someone could type “asdlk;fjasdf$300.00” and it would validate, because it can find a valid string within the whole of the text.

I explained it out in detail in case you are in a different country with different currency symbols and formats. To change the dollar sign, just change the first dollar sign to your currency symbol (other dollar signs have special meaning and have nothing to do with currency, so leave them). If you don’t use commas as thousands separators, but use something else, change the comma within ?:,? to your thousands separator. If you don’t use thousands separators at all, you could just leave it the way it is assuming no one would use them if they don’t know they exist, or you could remove the ?:,? entirely. If you use something besides a period as a decimal indicator (e.g. a comma) replace the \. in ?:\. with your decimal indicator (e.g. ?:,).

The regular expression above matches as many formats as I could think of.

30
$30
$30.00
3030
3,030
$3,030.00
etc.
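To try the expression against sample answers before pasting it into Qualtrics, a quick check in JavaScript can help. This is only a sketch: the isCurrency name is made up for illustration, and it assumes JavaScript's regular expression engine treats this pattern the same way Qualtrics does.

```javascript
// The currency expression from above, written as a JavaScript regex literal.
var currencyPattern = /^[+-]?\$?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$/;

// Hypothetical helper: true if the whole answer looks like US currency.
function isCurrency(text) {
  return currencyPattern.test(text);
}

// Print each sample answer next to whether it passes.
var samples = ["30", "$30", "$30.00", "3030", "3,030", "$3,030.00", "asdlk;fjasdf$300.00"];
samples.forEach(function (sample) {
  console.log(sample + " -> " + isCurrency(sample));
});
```

Every sample passes except the last one, which fails because of the ^ and $ anchors.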

It isn’t perfect, however. Someone could put in a value like $3000000,000, and it would match, even though with only 1 thousands separator it is hard to know what the person meant. I couldn’t figure out a way to require proper thousands separators if they are used. However, this kind of problem would be so rare that I can’t imagine ever seeing it.
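One possible way to require proper separators whenever commas are used is to split the number part into two alternatives: either groups of digits with a comma before every group of 3, or plain digits with no commas at all. The variant below is not the expression described above, and it has only been reasoned through in JavaScript rather than tried in Qualtrics. The name isStrictCurrency is made up for illustration.

```javascript
// Variant pattern: commas are all-or-nothing.
//   [0-9]{1,3}(?:,[0-9]{3})+   digits with a comma before every group of 3
//   [0-9]+                     digits with no commas at all
var strictPattern = /^[+-]?\$?(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]{2})?$/;

// Hypothetical helper: true if the answer is currency with consistent commas.
function isStrictCurrency(text) {
  return strictPattern.test(text);
}

console.log(isStrictCurrency("$3,030.00"));    // true
console.log(isStrictCurrency("3030"));         // true
console.log(isStrictCurrency("$3000000,000")); // false
```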

To use it, click on your text entry question. In the Validation Type area of the menu on the right, select Custom Validation. The logic should read IF [your question] [there’s only 1 option for the second drop down menu] [MATCHES REGEX] [^[+-]?\$?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$]. Note the outer brackets are only there to separate out the different drop down and text entry menus. Do not leave those brackets in place when pasting in the regular expression.

The final piece of the puzzle is how the data is stored. You could do some JavaScript work to have Qualtrics save the currency as a number. However, Excel is good at noticing currency and changing it to a number, so I figured I’d do the conversion in the data cleaning phase rather than adding complicated JavaScript to every question.
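For anyone who would rather do the conversion in code, the core of it is small. The sketch below covers only the conversion itself, under the assumption that the answer has already passed the validation above. The parseCurrency name is made up for illustration, and the step of writing the number back into Qualtrics is left out.

```javascript
// Hypothetical helper: strip dollar signs and commas, then convert to a number.
// Assumes the text already passed the currency validation.
function parseCurrency(text) {
  var cleaned = text.replace(/[$,]/g, "");
  return Number(cleaned);
}

console.log(parseCurrency("$3,030.00")); // 3030
console.log(parseCurrency("-$5"));       // -5
```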

Random thoughts: No, America doesn’t format their dates incorrectly

It’s a common criticism of America that the way we format dates is illogical. It’s so common that I now roll my eyes when I see images like this on Facebook:

If you are unaware of the criticism, the month, day, year format (M-D-Y) can be rephrased as, “Medium size unit, small size unit, large size unit.” Those outside of the US typically format dates as D-M-Y or Y-M-D, because it follows the logical progression of the sizes of the units. Days are smaller than months which are smaller than years! Our forefathers (or whoever established the M-D-Y format) must have been idiots! However, I would bet that most people can’t really argue as to why size of the unit is the best way to order the format. To provide some support for why the M-D-Y format makes sense, I offer the argument that the American format orders the units by importance of the information the unit conveys, in common usage.

How often do you write a date without the year? If you are telling someone a deadline next month, you may write it as, “Send me the file by 10/15 (October 15th).” The year is often meaningless. People can assume by context that you are talking about the current year. Similarly, people often write dates without the day of the month. “Yes, we’re getting married in 10/15 (October of 2015).” Again, the context tells the listener that they are giving a general time, and the day of the month doesn’t matter. Now, how many times have you seen a date without the month, such as “I’m moving on the 10th of the month in 2015?” The only instance I can think of is when giving a single unit of time (e.g. “Get it back to me by the 15th” or “We’re getting married in 2015”). However, in these cases formatting doesn’t really matter since there is only one format. The logical conclusion is that across the several formats of dates, it makes sense to always put the month first, since it is the only constant piece of information.

The units also carry different amounts of information. For instance, if you remove yourself from a specific time, days of the month carry virtually no information. Saying “the 15th of the month” is virtually meaningless outside of where that is relative to the current date (e.g. “that’s 2 weeks from now”). The 1st of the month is associated with rent payments, etc., but I would argue that such information is used infrequently. Months, however, include information about the seasons and other activities that recur yearly. If you are in school, August or September is probably associated with going back to school. December is associated with holidays. When you are thinking about celebrating someone’s birthday, the month generally carries the most important information. Sure, the day of the month is necessary, and the year tells you their age, but the month helps you determine how far away it is, when to start planning, who else to group the birthday with (e.g. if you have a single office birthday party once a month), etc.

I, of course, am not saying MM-DD-YYYY is the best format. Instead, as Howard Moskowitz may say, “There is no best format, only best formats.” When dealing with recurring events, the month is often the most important piece of information, and learning that information first makes sense. However, the year is often very important, so the international standard of Y-M-D makes the most sense in many cases (especially on computers, where it is the easiest to sort by date). Really, though, I’m not even advocating for M-D-Y’s normativity. Instead, I argue that when people don’t understand why something is the way it is (especially if that thing differs from what you expect), it is better to put some thought into figuring it out than to assume everyone is dumb.

How to email multiple MTurk workers

I frequently want to contact Mechanical Turk workers who have completed studies for me. Sometimes they say they are interested in the study and want more debriefing info. Sometimes I just want to thank them for notifying me of a typo or error that I didn’t catch. Sometimes I want to offer a follow-up opportunity to people who have participated before. It is a violation of Amazon Mechanical Turk policies to require workers to divulge their email address or other identifying information. Some people (including me) have avoided outright breaking that policy by entering the grey area of “asking” for the email address but not making HIT completion contingent on entering a valid address. Another way to contact workers would be to use the Requester User Interface (RUI). However, it’s difficult to A) figure out whether it is possible to contact workers through the RUI, B) figure out how to do it, and C) send emails to multiple workers, since the RUI requires you to do it one at a time.

MTurk does allow you to batch process emails to workers, but only through their API. The API allows people to build their own software to utilize MTurk features, including more features than what is available on the website. One such feature is sending emails, and allowing a computer to send out 500 emails as opposed to doing it manually saves a lot of time. The API requires using the .NET Framework, which is only implementable in languages I don’t have much experience with. My goal was to create a web service to allow anyone to log in and send emails. However, earlier this summer I came across an R library on GitHub that implemented many, if not all, of the API’s resources in a single package. That library has recently (9/8/14) been published on CRAN (making it easier to find and use). I want to thank Thomas Leeper for saving me the time of both learning Ruby and programming a web service in Ruby. Since Mr. Leeper (Dr. Leeper I believe, actually) implemented it in R, it won’t function as a web service, unless someone feels like setting up an R server with it. However, it is probably better this way. Using the MTurk API requires telling the program your Amazon Web Services (AWS) credentials. With those credentials someone can have full access to your entire AWS account, including credit cards on Amazon Payments and thousands of other pieces of information you’d rather not let people have. Thus, security is a concern, and using the MTurk API does carry some risks that you should be willing to accept before using it.

Below is a step-by-step guide to emailing workers. It is written assuming you have no knowledge of the R programming language.

  1. Install R and RStudio. RStudio is technically optional, but the rest of the instructions assume you have it.
  2. Open RStudio.
  3. In the console (lower-left is the default location), you should see a > on the bottom. Place the cursor there and paste the following:
     install.packages("MTurkR")  
    

    then hit Enter. The console should show the steps of installing (downloading, etc.) until it says that the package was successfully unpacked.

  4. Create a new R script by clicking File > New File > R Script (or Ctrl+Shift+N on PCs).
  5. Copy the following code into the script:
     library(MTurkR)  
     credentials(keypair=c("AccessKeyID goes here","SecretAccessKey goes here"))  
     contact(  
        subjects = "Email subject line goes here",  
        msgs = "Email body goes here",  
        workers = "WorkerId goes here"  
     )  
    

    Note: The contact function above is set up to contact a single worker. You can send multiple workers the same message or customized messages to each worker in a single function call. To do so, you use vectors. For our purposes, think of a vector as a list. If you have a list of 100 email subject lines, 100 email bodies, and 100 worker IDs, R can send 100 custom emails. If you have a list of 1 email subject line, 1 email body, and 100 worker IDs, R will send the same email to 100 people. It can get more complicated, but I don’t know how that would be applicable for this purpose. To create a vector, you use the c() function. Enclose all your text (e.g. worker IDs) in quotation marks and separate them with commas, like below.

     contact(  
        subjects = c("Email subject 1", "Email subject 2", "..."),  
        msgs = c("Email body 1", "Email body 2", "..."),  
        workers = c("WorkerId 1", "WorkerId 2", "...")  
     )  
    
  6. In the line that starts with credentials, replace “AccessKeyID goes here” with your AWS access key and “SecretAccessKey goes here” with your secret access key. To find your credentials, do the following:
    1.  Log in to the AWS Security Manager with the log in info for your Mturk account.
    2. Open the Access Keys panel.
    3. Click “Create New Access Keys.”
    4. Copy the Access Key and Secret Access Key and paste into the R script
      1. Note to those who know more than I do. It is my understanding that AWS no longer lets you view Secret Access Keys and you have to create a new one in order to see it. If there is a way to view a Secret Access Key without creating a new one, please email me or leave a comment.
  7. Save your R script. You will edit this file each time you want to send an email out, so put it somewhere you remember.
  8. To send an email, edit one of the two “contact” functions. The first one is for sending a single email to a single worker (note you can only send emails to workers who have done work for you). The second one is for sending multiple emails (e.g. multiple emails to the same worker [not recommended], the same email to multiple workers, or distinct emails to multiple workers). I’ll assume you want to send one email to multiple workers, so the instructions below edit the second contact() function.
    1. The parameter “subjects” is the subject line. You are assigning a list of subjects to that variable. If the list has only one entry, everyone will get the same subject. If you have 500 workers, you can create 500 separate subjects if you want. You just separate them by a comma. For the subject and every other item you will edit, make sure everything is enclosed within quotation marks.
    2. The parameter “msgs” is the body of the email. Similarly, you can create a list of separate email bodies or just put in one.
    3. The parameter “workers” is the WorkerId of the workers you want to contact. You can get worker IDs from the RUI if you don’t collect them yourself. Simply log into the Requester site, and go to Manage, then open a HIT posting, then download the CSV of the data. In the CSV are a lot of columns. One is labeled WorkerID. Entries in that column are what you are looking for. You can also view worker IDs in the main Review Results page. They’re in one of the default columns of results or, if they are not showing, you can click “Customize View” to have them show.
    4. Assuming you want to send one email to a list of workers, the contact function should look something like this
      1.  contact(  
            subjects = c("Email subject"),  
            msgs = c("Email body"),  
            workers = c("WorkerId 1","Worker Id 2", "...")  
         )  
        
  9. Execute the code.
    1. Place the cursor on the first line (starts with library) or highlight the entire line. Hit CTRL+Enter (for PCs) or Command+Enter (for Macs, I think). This loads the MTurkR library.
    2. Place the cursor on the line that starts with credentials or highlight the entire line. Hit CTRL+Enter or Command+Enter. This stores your credentials temporarily, to use when the program communicates with Amazon’s servers.
    3. Highlight the entire contact function you wish to execute (i.e. from the c in contact to the final parenthesis after all the worker IDs). Hit CTRL+Enter.
    4. Look at the console to see if everything executed fine. You should see a list of workers that were notified. If some weren’t, it’ll tell you why (e.g. worker ID invalid, worker hasn’t completed work for you, etc.). If you see an error, it may give you a hint as to why. For instance, perhaps you are missing a quotation mark surrounding your entries. Perhaps you are missing a comma between worker IDs.

Once everything has executed, you are finished. You can close the file, then reopen it next time, edit it again (step 8), then execute the new code (step 9). You only need to install the MTurkR package once, but you need to load it every time (the library command) and enter your credentials every time (the credentials function).

There are A LOT of other things the MTurkR library can do, such as setting up qualifications and qualification tests, publishing and managing HITs, etc. I may tinker with other features and post my findings, but for now, MTurkR is a great resource even if you only use it for contacting workers.

Preventing Mturk workers from prematurely submitting HITs (by altering the submit button’s functionality)

If you’ve ever used the Mturk web interface for creating HITs, you know it’s lacking in several areas. One that can be particularly annoying is the submit button that gets added to the end of every template you make. It looks like a plain HTML submit button, and is a half inch away from your normal content. I used to receive emails constantly from people who accidentally submitted their HIT before completing it. Never one to cry over 25 cents, I approved them, but answering those emails took too much time.

My initial solution was simple. I put about 10 blank lines at the end of my HIT template, followed by a horizontal bar, followed by instructions to only click the submit button once the HIT is complete. That stopped about 99% of the emails, which to me was a success. This is still all I do for my HITs, and if you are having this problem, just add the HTML code below to the end of your HITs.

 <p>&nbsp;</p>  
 <p>&nbsp;</p>  
 <p>&nbsp;</p>  
 <p>&nbsp;</p>  
 <p>&nbsp;</p>  
 <p>&nbsp;</p>  
 <p>&nbsp;</p>  
 <p>______________________________________________________________________________________</p>  
 <p>After you have completed everything in this HIT, click the submit button below.</p>  

However, a friend recently wanted something more foolproof. I can’t remember the exact reasoning, but my friend wanted the submit button gone entirely. If you need 100% assurance (good luck, you’ll never get it), you can try the method below. The logic is to first hide the submit button, which is a simple CSS modification. Once you want the user to submit the HIT, you can “click” the submit button for them by using the JavaScript click() function. What’s in the middle is really up to you. The code below replaces the submit button with a different button. That button queries a database which records when workers complete tasks within HITs. If a task is complete but the HIT is not submitted, the script submits the HIT. If the task is not completed, it alerts the user that they have not finished the task yet. This naturally assumes you have a database that records this information. Likely you do not. However, if you use confirmation codes or secret end-of-survey passwords, you can use JavaScript to check if the passcode is correct or not. If it is incorrect, they are either trying to cheat you out of 25 cents, or forgot the passcode (which is, of course, a problem).

 //Requires jQuery. You can do the selection and the AJAX by hand if you prefer, but it's much easier to just use jQuery.  
 //surveyId was never defined in the original snippet; set it to whatever ID your database uses for this study (placeholder value).  
 var surveyId = "YOUR_SURVEY_ID";  
 $(document).ready(function(){  
   //Hide submit button  
   $("#submitButton").hide();  
   //Create new submit button with validator function when clicked  
   $("<button/>").attr({'id':'newSubmit','type':'button'}).text("Check Submission").click(function(){  
     //Get location, which includes the worker ID, create an object to hold query string variables, and parse the query string  
     var href = window.location.href.toString();  
     var qs = {};  
     href.replace(new RegExp("([^?=&]+)(=([^&]*))?", "g"), function($0, $1, $2, $3){ qs[$1] = $3; });  
     //The url below would be whatever script you use to query the database. The query string includes the worker ID and the surveyId,  
     //so you know who completed what. The callback=? is jQuery's method of doing JSON with padding, which is necessary due to the  
     //cross-domain scripting (since this will be implemented in mturk)  
     var url = "https://www.test.org/mturk/getinfo.php?workerId=" + qs['wid'] + "&surveyId=" + surveyId + "&callback=?";  
     $.getJSON(url, function(rtnData){  
       if (rtnData.completed == 1) {  
         //Listed as completed in the database  
         $('#submitButton').click();  
       } else {  
         //Listed as something other than completed  
         alert("Our databases indicate you have not completed the survey yet. Please make sure you have completed every portion of the HIT before submitting.");  
       }  
     });  
   }).insertAfter("#submitButton");  
 });  

As I mentioned, there are a lot of problems with this kind of implementation. You’re risking screwing over a good worker to save yourself 50 cents, since unless the person is good at manipulating your HTML directly in their browser console (which if you didn’t know is very easy to do) they cannot submit the HIT. One quick fix for that is to replace the line of code toward the end with a confirm() message instead of an alert() message. The confirm message would tell the user that it looks like they haven’t completed the HIT yet, but they can submit anyway if they want to. That code would look like this:

 //Replace the line of code with the alert() function with these two lines of code.  
 var sub=confirm("Our databases indicate you have not completed the survey yet. Please make sure you have completed every portion of the HIT before submitting. If you think this is an error, and you have completed the HIT, click OK to submit the HIT. Click Cancel to continue working on the HIT.");  
 if(sub) $('#submitButton').click();  

That’s about all I can say about preventing workers from submitting HITs before they complete the task. Again, in my experience this is a non-problem. Simple HTML formatting fixes it. If someone is nefarious, there is not a lot you can do to stop him or her from stealing your payment other than rejecting workers after the fact. My philosophy is to provide the best experience I can for the workers, and I suppose adding the confirm() option isn’t a bad idea. I’ll leave you with an alternative function that validates based on a password you give your workers at the end of your survey. It uses the confirm() method, not the alert() method, since I think completely preventing someone from submitting a HIT will do more harm than good.

 //To use this function, copy it and replace the function that is currently passed as an argument to click() in the code above  
 function(){  
      //Someone can look up the password in the console, since they can see the JavaScript, but it's doubtful. Again, nothing's ever foolproof.  
      var password = "PUT YOUR PASSWORD HERE";  
      //Get password entered by user. The code assumes the text box the user types the password into has the id: password_input  
      var pass = $('#password_input').val();  
      //If nothing was entered, alert user and exit function  
      if(pass=="") {alert("Please enter the password given to you in the survey before submitting the HIT.");return;}  
      //Submit right away if the password matches; otherwise let the user decide via confirm()  
      var submit = (pass==password) ? true : confirm('The password you entered does not match the password displayed at the end of the survey. Please make sure the survey is 100% complete before submitting the HIT. If you would like to submit anyway, click OK. However, you run the risk of having your work rejected. If you would like to continue working on the HIT, click Cancel.');  
      if(submit) $('#submitButton').click();  
 }  

The Rise and Fall of Kitchen Gadgets

As I’m sure the greats before me did, I get a lot of my research inspiration while shopping at Goodwill. Recently, I was buying a replacement cover for my ice cream maker when I was surprised to see my exact model sitting on a shelf. A week later there were 2 of them. At another store there were 3. All. The. Same. Model. Why? I tested one out. It spun. That’s all an ice cream maker does really. The cold part isn’t electric and can’t really break. Well, it turns out the Cuisinart ICE-20 was insanely popular back in 2009. In 2013, apparently not so much. People are literally giving them away. My wife commented that they are the new bread machine. Everyone has one, one day. Everyone donates it to Goodwill the next.

Ice cream makers aren’t the first kitchen gadgets to have a meteoric rise and fall. Like I just said, bread machines were huge in the 90s. Electric knives made quite a splash in 1981. Fondue sets were famously over-popular in the 1970s. All these still exist today, but are hardly considered must-haves. So what caused their popularity and subsequent rejection? Impossible to answer. All of these occurred during food fads that required their use, but who knows which causes which (the fad or the popularity). Also, all these examples reflect what Alton Brown calls unitaskers. They are gadgets that are meant for doing one thing, and can’t really do much else. You can try to argue that by that definition, a knife is a unitasker since it only cuts, but you’d be wrong. Knives also stab. Seriously though, you can cut so many things in so many different ways that it doesn’t really qualify. Ice cream makers, on the other hand, really just make ice cream. Sorbet is doable, but that’s just ice cream without the fat. Products that do only one thing will tend to have a sharp but short period of popularity.

The examples above also reflect consumers’ desire to make their lives simpler or cheaper. Making food at home tends to be cheaper than going out. Dumping flour and water into a machine and having bread pop out 2 hours later is far easier than my 15-hour sourdough bread procedure (way better though… way better). However, the gadgets may not offer much help. Ice cream makers do something you can’t do otherwise, but the machine rarely gets cold enough, affecting quality. And the process of making the custard to put into the machine is pretty difficult itself. Bread machines are bulky, hard to put away, hard to get back out, and no one wants one left on the countertop taking up nearly an acre of counter real estate. Thus, products that are meant to make consumers’ lives easier but really don’t will also suffer from unitasker syndrome.

After those hypotheses, I am left with one question. Why do people actually get rid of them? Is it just a spring cleaning thing, or do people get rid of fad gadgets faster than they discard other items? My guess is the latter, but this will have to be borne out through data.

So what will be the next slightly used (or maybe untouched, still in the box) kitchen gadget I can walk into Goodwill and buy for 1/10 the original price? Sodastream comes to mind. It only makes soda. It saves you at most like 50 cents up against a steep start-up investment. The only thing missing is the bizarre fad.

Don’t Hate the Requester; Hate the Game: An economic dilettante’s take on Mturk ethics

Maybe they are just not very vocal, but I’ve never come across an Amazon Mechanical Turk (mturk) worker who lauds the business practices of mturk. In my experience as both a requester and a worker on mturk, I find ethical problems revolve around two primary concerns: 1) workers are paid a pittance and 2) requesters have free rein to reject work without justification. The second problem is hard to address. It basically comes down to “who watches the watchers.” Tools like TurkOpticon put some power into the workers’ hands, but probably not enough. However, it also seems to be the smaller problem. People may occasionally get shafted, but in the big picture the effects seem to be small compared to the low payment rate (not to say it doesn’t suck though).

The low payment rate is an interesting moral question because it is somewhat unique, at least to those involved. Opponents argue that a sizable percentage of workers use mturk as a primary or secondary income, and they should be earning a living wage. Proponents argue that mturk was not meant to be used that way. Unlike the Walmart wage debate, no requester is trying to hire a worker even part-time. They would say they are merely putting a task out there. If it pays too little, then no one will accept it; forcing a higher payment rate would simply mean they would go elsewhere to get their data. Everyone loses. This reasoning is not infallible, and the rhetorical arguments can go on forever.

Instead, I have been thinking more about the economic argument. I’ve covered about 2 weeks’ worth of game theory in my economic principles of marketing course (essentially a whirlwind microecon 1 and 2 course covered in like 5 weeks). So naturally, I am now an expert ready to analyze important, broad, real-world topics. (Side note: if you didn’t catch that, I’m not an expert and don’t claim that what I write below is at all the way a world-class economist would analyze the situation… hell, I may have even made a wrong assumption. If you can do better, join the discussion).

At the most shallow level, mturk, like all employee-employer relationships, is like a Prisoners’ Dilemma game. The incentives in place push for workers to not work hard, and push for employers to pay low wages. In the table below, workers can choose to work or loaf (i.e. give junk work), and requesters choose to pay a high wage or a low wage. Each party chooses a strategy at the same time with perfect knowledge of the other person’s strategy. By perfect knowledge, I mean that both the employer and employee know the payoffs to each party. Thus, they know what the other person will do given the strategy they pick.

|  | Work | Loaf |
|---|---|---|
| High wage | v-h, h-a | -h, h |
| Low wage | v-l, l-a | -l, l |

The first amount is the outcome for the requester, after the comma is the outcome to the worker. v=value of work, h=high wage, l=low wage, a=effort of working.

The equilibrium of this game (i.e. the point where neither player has an incentive to deviate from that strategy) is for the worker to loaf and the requester to pay a low wage. This is not how mturk operates, but it provides a baseline to analyze against. The low wages seem to exist, but the loafing does not. If it did, no one would use it for data purposes. However, these are the things people worry about because without checks the prisoners’ dilemma, where everyone is worse off, would occur.
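
To make the equilibrium claim concrete, here is a minimal JavaScript sketch that brute-forces the table above. The numbers (v=10, h=5, l=2, a=1) are made up purely for illustration, and findPureNash is a hypothetical helper written for this post, not part of any library. It simply looks for cells where neither side could do better by switching on their own.

```javascript
// Illustrative values only (assumptions, not data): v = value of work,
// h = high wage, l = low wage, a = effort of working.
var v = 10, h = 5, l = 2, a = 1;

// payoffs[row][col] = [requester payoff, worker payoff]
// rows: 0 = high wage, 1 = low wage; cols: 0 = work, 1 = loaf
var payoffs = [
  [[v - h, h - a], [-h, h]],
  [[v - l, l - a], [-l, l]]
];

// Hypothetical helper: return every cell where neither player gains by
// unilaterally switching strategies (a pure-strategy Nash equilibrium).
function findPureNash(table) {
  var equilibria = [];
  for (var r = 0; r < table.length; r++) {
    for (var c = 0; c < table[r].length; c++) {
      // Requester compares rows within the same column
      var requesterHappy = table.every(function (row) {
        return row[c][0] <= table[r][c][0];
      });
      // Worker compares columns within the same row
      var workerHappy = table[r].every(function (cell) {
        return cell[1] <= table[r][c][1];
      });
      if (requesterHappy && workerHappy) equilibria.push([r, c]);
    }
  }
  return equilibria;
}

var rowNames = ["High wage", "Low wage"];
var colNames = ["Work", "Loaf"];
findPureNash(payoffs).forEach(function (cell) {
  console.log(rowNames[cell[0]] + " / " + colNames[cell[1]]);
});
```

With these particular numbers the only cell that survives is Low wage / Loaf, matching the argument above.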

Something that differentiates mturk is the ability for requesters to reject workers’ submissions and not pay them. Forgetting that requesters can be evil and reject everyone for no reason, we’ll assume that requesters have sufficient reason and have to put some effort into checking. In this case, workers can still work or loaf, but requesters can pay high and check, pay high and not check, pay low and check, or pay low and not check. Additionally, loafing has costs as well (it takes some time to complete a HIT with bad work, but not as much as it takes to complete a HIT with good work; hence d < a). These costs were not included before because they did not affect the equilibrium. Last, checking has costs as well (e.g. time taken to add an attention filter to a survey or get correct answers to compare to, research validity costs of making workers think you are watching them, time taken to actually check and reject bad work, etc.). This cost is reflected in the cells where the requester checks whether the worker loafed.

|  | Work | Loaf |
|---|---|---|
| High wage, check | v-h-c, h-a | -c, -d |
| High wage, not check | v-h, h-a | -h, h-d |
| Low wage, check | v-l-c, l-a | -c, -d |
| Low wage, not check | v-l, l-a | -l, l-d |

Outcomes are in the same order. v, h, l, and a are the same. c=cost of checking, d=cost of loafing; d<a, c<v

Here there is no pure strategy Nash equilibrium (i.e. at no point in the table do both sides have no incentive to choose a different strategy). If the worker works, the requester’s strategy is to pay low and not check (that cell has the highest payoff to the requester). However, if the requester is going to do that (which the worker is aware of given perfect knowledge), then the worker would loaf since they do less work and get the same amount of money. Knowing that is the case, the requester would check, thus not paying the worker. Knowing the requester would check, the worker would work. Finally, knowing that, the requester wouldn’t check anymore. This unending cycle means each party’s strategy is chosen probabilistically (i.e. a mixed strategy Nash equilibrium) to maximize payoffs given beliefs about how likely an actor is to choose a particular strategy. This explains why some workers loaf and some work very hard. The loafers are playing a probability that they will be paid even though they did poor work. That probability exists because the requester is also playing a probability. Instead of checking work for each HIT, they only check some. Thus they save on the costs of checking, and still provide some incentive to give good work. It is possible to work out these probabilities as a function of the costs and payments involved, but instead of going through all that math, I will make two reasonable assumptions to change the game to make it slightly easier to solve.
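
For anyone curious, here is a rough sketch of what working out those probabilities could look like. It rests on one reading of the table, so treat it as an assumption rather than a proof: neither high-wage row ever pays the requester more than its low-wage counterpart, so only "low wage, check" and "low wage, not check" get mixed. The worker then works just often enough to make the requester indifferent about checking, and the requester checks just often enough to make the worker indifferent about loafing. The function name mixedEquilibrium and the dollar amounts are made up for illustration.

```javascript
// Sketch under stated assumptions: only the two low-wage rows are mixed,
// and 0 < c < l and 0 < a - d < l so both probabilities fall between 0 and 1.
// l = low wage, a = effort of working, c = cost of checking, d = cost of loafing.
function mixedEquilibrium(l, a, c, d) {
  if (!(c > 0 && c < l && a > d && a - d < l)) {
    throw new Error("Parameters fall outside the range this sketch handles");
  }
  return {
    // Worker works with probability p, chosen so the requester is
    // indifferent between checking and not checking: -c = -(1 - p) * l
    pWork: 1 - c / l,
    // Requester checks with probability q, chosen so the worker is
    // indifferent between working and loafing: l - a = (1 - q) * l - d
    qCheck: (a - d) / l
  };
}

// Made-up numbers for illustration: a $1.00 HIT, checking costs $0.20,
// working "costs" $0.50 of effort and loafing $0.10.
var eq = mixedEquilibrium(1.0, 0.5, 0.2, 0.1);
console.log("P(worker works) = " + eq.pWork.toFixed(2));       // 0.80
console.log("P(requester checks) = " + eq.qCheck.toFixed(2));  // 0.40
```

Under these assumptions, cheaper checking (smaller c) pushes more workers toward working, and a smaller gap between the effort of working and loafing (a - d) means requesters need to check less often.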

First, the cost of checking is rather low, especially for research experiments. Adding an attention filter requires little effort, and most people have a research assistant updating credit (thus there is no time component). Treating c as zero means the check and not check rows in each wage level pay the same to the requester (v-h or v-l). Second, the cost of not working is sometimes as high as the cost of working. The amount of time saved is often negligible, and some people (those not doing it as a primary or secondary income) find the cognitive challenge of some research studies stimulating (that is the studies that don’t kill you with boredom), making the cost of doing good work less. Thus if we treat a and d as equal they can also be removed from the columns. This last assumption is less believable, but it also may not affect the equilibrium in a meaningful way.

After these assumptions are made, a pure strategy equilibrium exists (i.e. one of the cells is clearly preferred, and people do not choose probabilistically anymore). That strategy is for workers to work and for requesters to pay a low wage and check. This equilibrium somewhat reflects the state of mturk. Most workers do good work. Most requesters check to make sure they do, and 99% of HITs pay below the minimum wage if computed at an hourly rate. One interesting point is that after the assumptions stated above are made, the low wage-check and low wage-no check cells are identical. This means that while all the actors can assume requesters will check, requesters may not check, and only they would know (since information about checking is only known to workers if they do not work). From talking with lots of other researchers who use mturk, I think this reflects the state of the game. They check when it’s easy to do so, but find it isn’t always necessary and often do not check.

Recall that two hazards existed in mturk. Workers can turn in poor work, and requesters can pay a low wage. Absent the ability to reject HITs by checking the work of workers, a Prisoner’s Dilemma existed where workers turned in poor work in exchange for a small wage. When the ability to reject was introduced, the optimal strategy for workers was to turn in good work. Rejection thus solves for one of the hazards of mturk. This ends up being beneficial for both parties since workers get paid more on average and requesters get a high value from the data received.

However, low wages still exist. For those who do mturk for fun, that might not be such a bad thing. There is some intrinsic reward, as I mentioned earlier, and it may be better than other ways to waste time. However, given the large number of people who use Mturk as a source of income, the ethical dilemma still exists. What can be done to push the equilibrium to the high wage cells? If costs of checking were different for the two levels (i.e. if for some reason it was cheaper to check the work when paying high wages compared to low wages), it may be enough to make the low wage payoff lower. However, it is unclear if this is feasible. A better possibility is to have different values of work for low wage and high wage. If the value of v in the high wage cells were, say, double that of v in the low wage cells, v-h-c may be greater than v-l-c. What does that look like in reality? 1) If you can help it, don’t complete underpaid HITs. If it takes 4 times longer to get a HIT done at 25 cents than at 50 cents, people will pay 50 cents. This is because the value (v) at that low wage is much lower given that people usually want their HITs completed quickly. 2) Hopefully this happens naturally, and not purposefully, but poorer data (i.e. more noise, more loafers, etc.) that comes from cheap HITs is likely to cause higher payments. Again, the value of a cheap HIT becomes low because the data isn’t usable or it takes time to weed through the poor responses. This could backfire though by causing requesters to look elsewhere for data collection needs. 3) Really focus when you complete a high paying HIT. Confirm the researchers’ belief that if they pay better they get better data (this is a generally accepted assumption). I’ve seen some respondents complete a $5 HIT that takes 10 minutes in under a minute solely because (I assume) they think there is a chance it’ll sneak by. Seeing data get worse as cost goes up instantly lowers the payment rate (i.e. the value of work at high payment rates becomes lower than the value of work at low payment rates). The example above is very isolated, however.

This simple economic analysis of mturk tells me that payment levels are where they are due to the incentive structure of mturk. Rejection prevents workers from not delivering good information, but giving good information regardless of payment causes low wages. Enforcing a type of minimum wage on HITs likely would push requesters out of the market, and return them to labs and expensive national online panel companies. Thus the primary way workers can help boost the wage level is to not complete low-wage HITs and ensure high-wage HITs are completed well. Seemingly obvious conclusions I know, but it was fun figuring out the game.

Social Consequences of Unforced Compliance: Reluctantly Joining Social Media

Twitter was always for overshare-ers to me. Instagram was wannabe creative-types without effort or talent. Klout? The attention whore’s measuring stick. One week ago, I started using all of those. After 7 or so days of being an oversharing, wannabe creative-type, attention whore I’ve noticed interesting behavior and changes in attitudes.

My social media awakening was less abrupt than I let on. I have had a Friendster and a LiveJournal since 2003. I was also an early MySpace and Facebook adopter. The purpose of these tools was to connect personally with friends and family. Facebook still serves that purpose to me. I live far from virtually everyone I’ve ever known, and it’s nice to know they exist. Facebook is not an external tool to me though. When a friend’s account becomes a marketing extension of the company they work for, the prejudice (and pride) run deep.

My current thrust was spurred on by a Professor’s (Pete McGraw) repeated discussions about the importance of external visibility for research. Projects used to be important based on citation counts. In the future, real-world impact is likely to be of much greater importance. This wasn’t entirely new to me, but my thinking previously was, “Well, I’ll be super impactful when I come up with my popular audience book idea.”

In the span of about five minutes my thoughts went from, “I should probably update my blog more regularly” to “Maybe also dust off the dormant Twitter account and broadcast the blog better.” Then the ball started rolling really fast. “I guess I should look at my Klout score and see if improving it leads to more people reading the blog.” “Instagram impacts Klout a lot? I said never, but since the app is free let’s go for it.”

Festinger’s Cognitive Consequences of Forced Compliance basically proposed that being forced to commit an act, given certain conditions, changes attitudes to be in line with the behavior. Since my behavior wasn’t forced, did my attitudes change in those few minutes, which then changed my behavior, or does voluntarily changing behavior lead to the same (or stronger) attitude change compared to forced compliance?

I don’t really desire to answer that question. All I know is the attitudes certainly changed. I enjoy having additional channels to interact with friends. I enjoy cultivating a professional online persona. I enjoy seeing a larger readership of my blog.

Self-Reflective Things I’ve Learned Thus Far

  • I am old. I realize this is a very cliched joke to make when you are in your twenties and comparing yourself to teenagers, so that’s not really what I am doing. Instead, I’m making the slightly less cliched joke of pointing out when you are becoming everything you made fun of your parents about. Learning the norms of social networks is difficult. Sometimes I have these out of body experiences where I’m looking at myself like I’m watching a child learning to ice skate (or a grown 27 year old man who grew up in New England and never learned to skate learn to skate… blog pending). Luckily I’m cognizant of this, so I’m not too grandpa-on-facebook-like.
  • Instagram is not nearly as douchey as I had always thought. I’ve always felt like I don’t document important events enough (note this is nothing like the recent research on over-documenting… I don’t even have pictures from my honeymoon up the California coast), and Instagram provides a way to do that with double the positive reinforcement: friends connect, Klout score increases. Side note: Headlines like this one recently posted on Instagram’s twitter account will always make me roll my eyes: “Top 10 Photographers [really???] on Instagram”
  • Interestingly, after joining Instagram and Klout, I felt the need to make fun of that fact. Internally, I apparently couldn’t accept the statement “I’m joining Instagram.” It had to be something like, “Even I think I’m crazy and stupid, but I’m joining Instagram anyway.” This is probably already documented, but the need for some semblance of attitudinal or behavioral consistency even when you are performing a completely inconsistent action seems like an interesting phenomenon (assuming it’s not just me).

How to randomize or shuffle an array in Qualtrics

Qualtrics does many things right. However, its vast capabilities sometimes make me think it can do things that it can’t. Unlike SurveyMonkey, where I just assume it can’t do anything, sometimes I think Qualtrics can do everything. Luckily, Qualtrics’ JavaScript capabilities make it so that if you know some coding, you can do a lot of the things you thought were impossible.

Randomizing arrays of numbers is something Qualtrics can’t do (easily) without JavaScript. Technically, you can create a randomizer, inside the randomizer create X branch elements, set each element to automatically occur (previously set an embedded data element named a, set the value to 1, then set the branches to occur if a=1), create an embedded data element in each branch, and set the randomizer to randomly show X elements evenly. Pretty difficult, and even that won’t do everything. You would need to add in however you implement the random number, or use some piped text code to add that number to a different element, so you can end up with a full array after the randomizer ends. Regardless, it is difficult.

In Javascript the code is this (here it is on Github):

 function shuffleArray(array) {  
   for (var i = array.length - 1; i > 0; i--) {  
     var j = Math.floor(Math.random() * (i + 1));  
     var temp = array[i];  
     array[i] = array[j];  
     array[j] = temp;  
   }  
   return array;  
 }  

In a question in Qualtrics, you simply insert this code into the JavaScript editor. One way or another, create your array (type it in, grab it from the question text, etc…. if you need help doing this, let me know in the comments), shuffle it using this code, then one way or another use it. You can set the array to an embedded data element using the SurveyEngine.setEmbeddedData() function that is already available in the Qualtrics API (mentioned here). You can add it as text to a question (I think that function is mentioned in the previous link; if not, it’s another SurveyEngine function). Again, the implementation options are infinite. If you’ve gotten this far and don’t know how to use your randomized array, again, let me know in the comments.
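
As one hedged sketch of the "use it" step, the snippet below shuffles a typed-in array and stores it as a comma-separated string in an embedded data field. The field name shuffledOrder and the helper shuffledAsString are made up for this example. It assumes the usual Qualtrics.SurveyEngine.addOnload wrapper from the question JavaScript editor, and I believe the embedded data field also has to be declared in the Survey Flow for the value to be saved. The typeof check is only there so the code also runs outside of Qualtrics.

```javascript
// Same shuffle as above, repeated so this block runs on its own.
function shuffleArray(array) {
  for (var i = array.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var temp = array[i];
    array[i] = array[j];
    array[j] = temp;
  }
  return array;
}

// Hypothetical helper: shuffle a copy (so the original array is left
// untouched) and join it into a string that fits in one embedded data field.
function shuffledAsString(items) {
  return shuffleArray(items.slice()).join(",");
}

// Only runs inside Qualtrics, where the Qualtrics object exists.
if (typeof Qualtrics !== "undefined") {
  Qualtrics.SurveyEngine.addOnload(function () {
    // "shuffledOrder" is a placeholder field name; use your own.
    Qualtrics.SurveyEngine.setEmbeddedData("shuffledOrder", shuffledAsString([1, 2, 3, 4, 5]));
  });
}
```

Later in the survey you can pipe the field back in or split the string on commas to recover the array.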

Changing Mturk submit button functionality & A new way to prevent duplicate workers on separate HITs

Previously, I’ve posted some simple ways to prevent workers from completing your mturk HITs because they already completed an identical one last week or last month, etc. (here, here, and here). Sometimes keeping track of long lists of mturk workers who have completed your previous HITs is difficult. Editing, copying, and pasting a list of 1000 workers that took your survey last time and can’t take it this time takes a while and is prone to mistakes. The best method I can suggest when this is difficult is to start databasing your workers. This is how I manage my Mturk participants, but it requires a decent amount of knowledge of JavaScript, PHP, and MySQL. If that is not your cup of tea, I have been trying to work on an especially simple method using cookies, though it is by definition imperfect.

Cookies are simply small bits of data that a webpage can store on your computer, then later read back. It’s how websites “remember” that you are logged in, for instance. Using a small amount of JavaScript, you can set a cookie that is named after your study. The existence of the cookie with that particular name means that the worker completed that study already. When you post a study later and want to exclude people who took your first study, your HIT will have some JavaScript in it that searches for that cookie you created earlier. If it exists, the person took your first study and is excluded; if it doesn’t, they can take your new study. There is no copying and pasting, and no long files full of worker IDs.

However, as I said, it is imperfect. Some users turn off cookies, preventing you from setting the cookie. Some users clear their cookies regularly, deleting the cookie you created. Cookies are tied to a browser on a particular computer, not to a user, so a worker can switch to a different computer (or a different browser) and still take your new study. Likewise, a second person who tries to take your study on the same computer will be blocked, because the cookie applies to anyone using that browser, not to the specific person. Given the large number of mturk workers, and the base rates for the problems I just mentioned, my guess is that this method would stop about 90% of the workers you don't want taking your studies. It may be more or less. The key factors are how many people regularly clear their cookies or complete HITs on multiple computers.

Here is the commented code you can copy and paste into your HIT template window. It first loads jQuery, then adds an event listener to the Submit button, then checks if a cookie exists with the same name as the cookie that gets created when the submit button is clicked. All you need to change is the studyName value to something unique for your study. However, what happens on line 20 is up to you. Right now it simply alerts the user that they have taken your study before. Something more useful would be to prevent any other content from loading and tell them not to accept the HIT (or to return it). Since exactly how it gets implemented can change from HIT to HIT, I did not specify a procedure beyond alerting the user. The cookies are set to last a maximum of 1 year. You can hypothetically make it longer, but people have likely deleted their cookies by then anyway.

Github code here

1:  <script src="https://code.jquery.com/jquery-1.10.1.min.js"></script>  
2:  <script type="text/javascript">  
3:  //this code assumes jQuery is implemented, though it would not be difficult to do this with straight javascript. .ready() assures the code runs after the submit button is added  
4:  $(document).ready(function () {  
5:    var studyName="dm102513";//replace this value with a unique, acceptable cookie name... no spaces and stick with alphanumerics and underscores  
6:    $("#submitButton").click(function(){  
7:        var exdate=new Date();  
8:        exdate.setDate(exdate.getDate() + 365);  
9:     document.cookie=studyName+"=1; expires="+exdate.toUTCString()+"; domain=mturkcontent.com; path=/";//When the user clicks the Submit button, it creates a cookie that you can read later as having already completed this survey  
10:    });  
11:    var i,x,y,ARRcookies=document.cookie.split(";");//variables necessary to read the cookies. Last variables separates all available cookie names into an array we will loop through to find our cookie  
12:    for (i=0;i<ARRcookies.length;i++){  
13:     x=ARRcookies[i].substr(0,ARRcookies[i].indexOf("="));//x=cookie name  
14:     y=ARRcookies[i].substr(ARRcookies[i].indexOf("=")+1);//y=cookie value  
15:     x=x.replace(/^\s+|\s+$/g,"");//Remove white space  
16:     if (x==studyName && y==1){  
17:             //If your cookie exists AND the value is 1, they have taken your survey before.  
18:             //Here you would put code that somehow prevents them from doing anything for your HIT, like not loading content etc.  
19:             //However, for our purposes here, I just put a simple alert in. Change this to whatever code works for you.  
20:       alert("You are not eligible for this HIT because you have already completed an identical HIT.");  
21:     }  
22:    }  
23:  });  
24:  </script>  
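As one possible replacement for the alert on line 20, here is a minimal sketch that swaps the HIT content for a message and disables the submit button. The function name blockHit and the "hitContent" id are assumptions for this example (your template would need a wrapper element with that id); "submitButton" is the id used in the code above. Keep in mind this only changes what the worker sees in the browser, so it is no stronger than the cookie itself.

```javascript
// Hypothetical helper: replaces the HIT content with a message and disables
// the submit button, if one is passed in.
function blockHit(container, submitButton, message) {
  container.textContent = message; // removes whatever was inside the container
  if (submitButton) {
    submitButton.disabled = true; // prevents the worker from submitting
  }
}

// Possible use in place of the alert on line 20 (ids are assumptions):
// blockHit(
//   document.getElementById("hitContent"),
//   document.getElementById("submitButton"),
//   "You have already completed an identical HIT. Please do not accept this HIT, or return it."
// );
```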
