ProQuest TDM Studio: Data Visualizations

TDM Studio is ProQuest’s text and data mining platform. This platform enables researchers to create datasets using licensed ProQuest content and analyze those datasets by running Python or R scripts in an accompanying Jupyter Notebook. A component of TDM Studio is Data Visualizations, a platform for researchers with little to no coding experience. Data Visualizations is accessible to any member of the Mason community.
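
Data Visualizations itself requires no coding, but the full TDM Studio workbench does expose Python and R notebooks. As a rough sketch of the kind of script a researcher might run there, the example below counts the most frequent words across a folder of article text files. The folder name and plain-text file format are assumptions made for illustration only; they are not TDM Studio's actual dataset layout.

```python
# Hypothetical sketch of a term-frequency analysis a researcher might run
# in a TDM Studio Jupyter notebook. The "my_dataset" folder and .txt file
# format are illustrative assumptions, not TDM Studio's real layout.
import re
from collections import Counter
from pathlib import Path

ARTICLE_DIR = Path("my_dataset")  # assumed folder of article text files

counts = Counter()
for path in ARTICLE_DIR.glob("*.txt"):
    text = path.read_text(encoding="utf-8").lower()
    counts.update(re.findall(r"[a-z]+", text))

# Show the 20 most common words across the dataset
for word, n in counts.most_common(20):
    print(f"{word}\t{n}")
```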

The first feature of Data Visualizations is geographic analysis. Geographic analysis maps articles based on locations named within the articles. All articles come from publications GMU subscribes to through ProQuest, including titles such as the New York Times, Washington Post, and Los Angeles Times. Users can create up to five projects, and each project can contain up to 10,000 articles. Other methodologies, including topic modeling, sentiment analysis, and ngram/term frequency, are currently in development, as is export functionality.
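
To give a sense of what mapping articles by the locations named within them involves, the sketch below uses named-entity recognition to pull place names out of a snippet of text. spaCy is used here purely as a stand-in to illustrate the general technique; it is not ProQuest's implementation, and producing the actual map would additionally require geocoding each extracted place name.

```python
# Illustrative example of the general technique behind geographic analysis:
# extracting place names from article text with named-entity recognition.
# spaCy is a stand-in here, not ProQuest's pipeline.
# Setup: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

article = (
    "Officials in Fairfax, Virginia met with delegates from Toronto "
    "and Sydney to discuss regional transit funding."
)

doc = nlp(article)
# GPE = countries, cities, states; LOC = other locations
places = [ent.text for ent in doc.ents if ent.label_ in ("GPE", "LOC")]
print(places)  # e.g. ['Fairfax', 'Virginia', 'Toronto', 'Sydney']
```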

Accessing ProQuest TDM Studio: Data Visualizations

  1. Go to tdmstudio.proquest.com/home.
  2. Click log in to TDM Studio.
  3. Click create an account. Enter your GMU email address, read through the terms of use and check the box if you consent, then click create account.
  4. You will receive an email at your GMU address. Click the link in the email to confirm the creation of your account.
  5. Begin using Data Visualizations.

Using ProQuest TDM Studio: Data Visualizations

  1. Once you’re logged in, you will be directed to the visualization dashboard. From the dashboard you can manage your projects and interact with them using the available analysis methods. You can create up to five projects. Click create new project to begin.
  2. In the search box, enter your search terms. The following publications will be searched for the specified terms: Chicago Tribune; Washington Post; Wall Street Journal; New York Times; The Globe and Mail; Los Angeles Times; The Guardian; Sydney Morning Herald; South China Morning Post; and Times of India.
  3. Next, refine your dataset; only 10,000 documents can be analyzed per project. You can refine your search by publication, date published, and document type. Click next: review project when finished.
  4. You will be given a summary of your project that includes the document count, publications, and the selected analysis. Enter a descriptive project name and click create project.
  5. A dialog box will appear that says your project was successfully submitted. Once you close the message, you will be redirected to the visualization dashboard.
  6. Your project will take time to generate. You will see the name, date range, search query, document count, publications, and analysis method listed for your newly created project. Once the project has been successfully generated, you can click show actions. Click delete if you wish to delete the project, or click open geographic visualization to open the visualization.
  7. The geographic visualization will open with a global view. The top menu includes the project name, the number of articles in the project, the date range, and an option to export data.
  8. Click on a cluster or map marker. A drawer will open on the right-hand side of the screen listing the articles in that cluster or marker; the larger the cluster, the longer the drawer will take to open. If you click on the title of an article, a new tab or window will open and you will be directed to the article itself. Click hide articles to close the drawer.
  9. You can use the slider along the bottom of the map to change the date range. The map will update to reflect the new date range.
  10. Later updates to the platform will enable users to export data.

Want to try out Data Visualizations? Follow the link above, or find Data Visualizations on the Libraries’ A-Z Database list.

If you have questions about Data Visualizations or TDM Studio, contact the Digital Scholarship Center (DiSC) at datahelp@gmu.edu.

Love Data Week Workshops Feb 10-Feb 14

To celebrate Love Data Week (February 10 – February 14, 2020), the Digital Scholarship Center (DiSC) will run a series of workshops focusing on getting started with R, Python, GIS, text analysis, using secondary data, and managing data projects. The workshops will take place in the DiSC Lab, Room 2701A Fenwick Library. All are welcome to attend these workshops regardless of skill level. Registration is strongly encouraged. Click on the time links below to register.

Working With and Analyzing Secondary Data – Monday, February 10, 1:00 PM and 5:00 PM
Using Voyant for Text Analysis – Tuesday, February 11, 1:00 PM and 5:00 PM
R/Python: How and Why to Get Started – Wednesday, February 12, 1:00 PM and 5:00 PM
OSF 101: Introduction to the Open Science Framework – Thursday, February 13, 1:00 PM and 5:00 PM
Introduction to GIS and Mapping – Friday, February 14 at 1:00 PM

On Monday, February 10 at 1 PM and 5 PM, Wendy Mann will lead a workshop on Working With and Analyzing Secondary Data. She will discuss how to acquire, review, and analyze secondary data. Participants will learn how to prepare this kind of data for analysis and bring it into a statistical package. Reviewing datasets and documentation will also be covered.

Learn how to Use Voyant for Text Analysis on Tuesday, February 11 at 1 PM and 5 PM. Alyssa Fahringer will provide an overview of Voyant, showcase projects that utilize the platform, and discuss use cases. She will walk attendees through how to upload and explore a corpus in Voyant, as well as how to embed and export their data.

On Wednesday, February 12 at 1 PM and 5 PM, Debby Kermer will go over How and Why to Get Started with R and Python. She will cover when those languages should be used, what to know about them prior to getting started, and resources for learning them. The final half hour of the workshop will be devoted to answering questions and assisting with software installation and hands-on learning.

Come for an Introduction to the Open Science Framework on Thursday, February 13 at 1 PM and 5 PM. Margaret Lam and Carl Leak will discuss how to navigate and create projects on the Open Science Framework (OSF). Attendees will learn how to make their research practices reproducible, track activity, and use Templates and Forks in OSF to create new projects.

On Friday, February 14 at 1 PM, Joy Suh will lead an Introduction to GIS and Mapping. Participants will learn the basics of visualizing geographic information and creating maps in a GIS. She will talk about how to understand geospatial data, where to find mapping source data, and how to use ESRI ArcGIS. Additionally, attendees will learn how to read and interpret maps and data, and how to use cartographic principles to create maps for presentations and publications.