the:behavioral:lab

How to email multiple MTurk workers

I frequently want to contact Mechanical Turk workers who have completed studies for me. Sometimes they say they are interested in the study and want more debriefing info. Sometimes I just want to thank them for notifying me of a typo or error that I didn’t catch. Sometimes I want to offer a follow-up opportunity to people who have participated before. It is a violation of Amazon Mechanical Turk policies to require workers to divulge their email address or other identifying information. Some people (including me) have avoided outright breaking that policy by entering the grey area of “asking” for the email address but not making HIT completion contingent on entering a valid address. Another way to contact workers would be to use the Requester User Interface (RUI). However, it’s difficult to A) figure out whether it is possible to contact workers through the RUI, B) figure out how to do it, and C) send emails to multiple workers, since the RUI requires you to do it one at a time.

MTurk does allow you to batch process emails to workers, but only through its API. The API allows people to build their own software that uses MTurk features, including features that are not available on the website. One such feature is sending emails, and letting a computer send out 500 emails instead of doing it manually saves a lot of time. Working with the API directly meant using tools such as the .NET SDK, built for languages I don’t have much experience with. My goal was to create a web service to allow anyone to log in and send emails. However, earlier this summer I came across an R library on GitHub that implemented many, if not all, of the API’s resources in a single package. That library has recently (9/8/14) been published on CRAN (making it easier to find and use). I want to thank Thomas Leeper for saving me the time of both learning Ruby and programming a web service in Ruby.

Since Mr. Leeper (Dr. Leeper I believe, actually) implemented it in R, it won’t function as a web service, unless someone feels like setting up an R server with it. However, it is probably better this way. Using the MTurk API requires telling the program your Amazon Web Services (AWS) credentials. With those credentials someone can have full access to your entire AWS account, including credit cards on Amazon Payments and thousands of other pieces of information you’d rather not let people have. Thus, security is a concern, and using the MTurk API does carry some risks that you should be willing to accept before using it.

Below is a step-by-step guide to emailing workers. It is written assuming you have no knowledge of the R programming language.

  1. Install R and RStudio. RStudio is technically optional, but the rest of the instructions assume you have it.
  2. Open RStudio.
  3. In the console (lower-left is the default location), you should see a > at the bottom. Place the cursor there and paste the following:
     install.packages("MTurkR")  
    

    then hit Enter. The console should show the steps of installing (downloading, etc.) until it says that the package was successfully unpacked.

  4. Create a new R script by clicking File > New File > R Script (or Ctrl+Shift+N on PCs).
  5. Copy the following code into the script:
     library(MTurkR)  
     credentials(keypair=c("AccessKeyID goes here","SecretAccessKey goes here"))  
     contact(  
        subjects = "Email subject line goes here",  
        msgs = "Email body goes here",  
        workers = "WorkerId goes here"  
     )  
    

    #Note: The contact function above is set up to contact a single worker. You can send multiple workers the same message, or a customized message to each worker, in a single function call. To do so, you use vectors. For our purposes, think of a vector as a list. If you have a list of 100 email subject lines, 100 email bodies, and 100 worker IDs, R can send 100 custom emails. If you have a list of 1 email subject line, 1 email body, and 100 worker IDs, R will send the same email to 100 people. It can get more complicated, but I don’t know how that would be applicable for this purpose. To create a vector, you use the c() function. Enclose each piece of text (e.g. each worker ID) in quotation marks and separate them with commas, like below.

     contact(  
        subjects = c("Email subject 1", "Email subject 2", "..."),  
        msgs = c("Email body 1", "Email body 2", "..."),  
        workers = c("WorkerId 1", "WorkerId 2", "...")  
     )  
    
  6. In the line that starts with credentials, replace AccessKeyID goes here with your AWS access key and SecretAccessKey goes here with your secret access key. To find your credentials, do the following:
    1. Log in to the AWS Security Manager with the login info for your MTurk account.
    2. Open the Access Keys panel.
    3. Click “Create New Access Keys.”
    4. Copy the Access Key and Secret Access Key and paste them into the R script.
      1. Note to those who know more than I do: it is my understanding that AWS no longer lets you view existing Secret Access Keys, so you have to create a new one in order to see it. If there is a way to view a Secret Access Key without creating a new one, please email me or leave a comment.
  7. Save your R script. You will edit this file each time you want to send an email out, so put it somewhere you remember.
  8. To send an email, edit one of the two “contact” functions. The first one is for sending a single email to a single worker (note you can only send emails to workers who have done work for you). The second one is for sending multiple emails (e.g. multiple emails to the same worker [not recommended], the same email to multiple workers, or distinct emails to multiple workers). I’ll assume you want to send one email to multiple workers, so the instructions below edit the second contact() function.
    1. The parameter “subjects” is the subject line. You are assigning a list of subjects to that variable. If the list has only one entry, everyone will get the same subject. If you have 500 workers, you can create 500 separate subjects if you want. You just separate them by a comma. For the subject and every other item you will edit, make sure everything is enclosed within quotation marks.
    2. The parameter “msgs” is the body of the email. Similarly, you can create a list of separate email bodies or just put in one.
    3. The parameter “workers” is the WorkerId of the workers you want to contact. You can get worker IDs from the RUI if you don’t collect them yourself. Simply log into the Requester site, go to Manage, open a HIT posting, then download the CSV of the data. The CSV has a lot of columns. One is labeled WorkerId. Entries in that column are what you are looking for. You can also view worker IDs in the main Review Results page. They’re in one of the default columns of results or, if they are not showing, you can click “Customize View” to have them show.
    4. Assuming you want to send one email to a list of workers, the contact function should look something like this:
      1.  contact(  
            subjects = c("Email subject"),  
            msgs = c("Email body"),  
            workers = c("WorkerId 1", "WorkerId 2", "...")  
         )  
        
  9. Execute the code.
    1. Place the cursor on the first line (starts with library) or highlight the entire line. Hit CTRL+Enter (for PCs) or Command+Enter (for Macs, I think). This loads the MTurkR library.
    2. Place the cursor on the line that starts with credentials or highlight the entire line. Hit CTRL+Enter or Command+Enter. This stores your credentials temporarily, to use when the program communicates with Amazon’s servers.
    3. Highlight the entire contact function you wish to execute (i.e. from the c in contact to the final parenthesis after all the worker IDs). Hit CTRL+Enter or Command+Enter.
    4. Look at the console to see if everything executed fine. You should see a list of workers that were notified. If some weren’t, it’ll tell you why (e.g. worker ID invalid, worker hasn’t completed work for you, etc.). If you see an error, it may give you a hint as to why. For instance, perhaps you are missing a quotation mark surrounding one of your entries. Perhaps you are missing a comma between worker IDs.
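
Typing hundreds of worker IDs by hand invites exactly the quotation-mark and comma mistakes mentioned in the last step. Here is a minimal sketch of one way around that, reading the IDs straight from the downloaded results CSV. It assumes the column is named WorkerId (check the header of your own file); the helper name read_worker_ids and the made-up sample file are only for illustration and are not part of MTurkR.

```r
# Stand-in for a downloaded results file, so the sketch runs on its own.
# In practice, point read_worker_ids() at your own CSV instead.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "AssignmentId,WorkerId,Answer",
  "a1,A1EXAMPLE000001,yes",
  "a2,A1EXAMPLE000002,no",
  "a3,A1EXAMPLE000001,yes",
  "a4, A1EXAMPLE000003 ,no"
), csv_path)

# Hypothetical helper: returns a clean character vector of unique worker IDs.
# The default column name "WorkerId" is an assumption about the CSV layout.
read_worker_ids <- function(path, column = "WorkerId") {
  results <- read.csv(path, stringsAsFactors = FALSE)
  if (!column %in% names(results)) {
    stop("Column '", column, "' not found in ", path)
  }
  ids <- trimws(as.character(results[[column]]))  # drop stray spaces
  ids <- ids[!is.na(ids) & nzchar(ids)]           # drop empty entries
  unique(ids)                                     # one email per worker
}

worker_ids <- read_worker_ids(csv_path)
print(worker_ids)

# One subject and one body for everyone, as in the step 8 example.
# contact() needs MTurkR loaded and credentials set, so it is commented out here.
# contact(
#   subjects = "Email subject",
#   msgs = "Email body",
#   workers = worker_ids
# )
```

Because worker_ids is already a vector, it can be passed to the workers parameter without wrapping it in c() or adding any quotation marks yourself.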

Once everything has executed, you are finished. You can close the file, then reopen it next time, edit it again (step 8), then execute the new code (step 9). You only need to install the MTurkR package once, but you need to load it every time (the library command) and enter your credentials every time (the credentials function).
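
Since the script is saved with your keys typed into it, anyone who can open the file can read them. A minimal sketch of one alternative is to keep the keys in environment variables and have the script read them each time. The variable names and the get_keypair helper below are my own assumptions for illustration, not something MTurkR requires.

```r
# Hypothetical helper: read the AWS key pair from environment variables
# instead of typing it into the script. The variable names are assumptions.
get_keypair <- function(id_var = "AWS_ACCESS_KEY_ID",
                        secret_var = "AWS_SECRET_ACCESS_KEY") {
  keypair <- c(Sys.getenv(id_var), Sys.getenv(secret_var))
  if (any(!nzchar(keypair))) {
    stop("Set ", id_var, " and ", secret_var, " before running this script.")
  }
  keypair
}

# For demonstration only: placeholder values set for this session.
# In practice, set these outside the script (for example in ~/.Renviron).
Sys.setenv(AWS_ACCESS_KEY_ID = "AccessKeyID goes here",
           AWS_SECRET_ACCESS_KEY = "SecretAccessKey goes here")

keypair <- get_keypair()

# With MTurkR loaded, the credentials line from step 5 would then become:
# credentials(keypair = keypair)
```

This does not remove the risks described earlier, since the keys still live on your computer, but it keeps them out of a script you might share or back up.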

There are A LOT of other things the MTurkR library can do, such as setting up qualifications and qualification tests, publishing and managing HITs, etc. I may tinker with other features and post my findings, but for now, MTurkR is a great resource even if you only use it for contacting workers.
