Description
Data Curator
The University of Pittsburgh Center for Research Computing and Data (CRCD) seeks a highly motivated and detail-oriented Data Curator to support our growing research data management and curation services and related training opportunities. This position, funded for a minimum of 3 years, will play a critical role in ensuring that our teams, particularly those from social science and related disciplines, are empowered to organize and deploy research data and data-based applications that are findable, accessible, interoperable, and reusable (FAIR).
The Data Curator will be responsible for executing and continuously improving data curation workflows, working closely with researchers, training teams, and repository staff to facilitate responsible and effective use of research data.
The ideal candidate will have demonstrable experience in managing unstructured and semi-structured data, a keen understanding of the needs of social science researchers, and excellent communication skills for collaborative work. Experience with machine learning pipelines and transformer-based text processing tools is preferred.
Key Responsibilities:
- Develop, Implement, and Improve Data Curation Workflows:
  - Execute and refine replicable and efficient data curation workflows for data file packaging and repository ingestion, with a special focus on unstructured and semi-structured information from social science and related research projects.
  - Perform file format normalization, prepare documentation, and generate comprehensive metadata for datasets.
  - Conduct quality reviews of dataset packages to ensure completeness, accuracy, and adherence to standards.
  - Manage the transfer of curated dataset packages to specified institutional or external repositories.
  - Organize intermediate data products and documentation to enhance the efficiency and accuracy of interactive applications and tools built upon repository data.
  - Document the provenance and transformations of data across model-based processing within a project where applicable.
- User-Focused Curation for Social Science Data:
  - Design and carry out all curation activities with careful attention to the specific needs of users of social science datasets. This includes a focus on appropriate data types, file formats, dataset sizes, and content sensitivity.
  - Ensure curated data not only meets FAIR standards and satisfies all reporting requirements (e.g., for funders, publishers) but also facilitates the responsible use of the data in downstream research, training, and educational activities.
  - Address challenges related to de-identification, data privacy, and ethical considerations pertinent to social science data.
- Collaboration and Communication:
  - Collaborate effectively with research and training teams within the social sciences and other disciplines to understand their data curation needs and challenges.
  - Iterate on and improve data curation workflows and documentation based on feedback from researchers, data users, and other stakeholders.
  - Provide consultation and training to researchers on data management best practices, data documentation, and preparation for analysis, curation, and deposit.
- Technical Implementation and Innovation:
  - Contribute to the design and implementation of APIs for programmatic data access where appropriate, enhancing data discoverability and usability.
  - Stay abreast of emerging trends, tools, and standards in data curation, digital preservation, and research data management, particularly as they apply to social science data.
  - Participate in local and national discussions and initiatives related to research data management and curation.
The Center for Research Computing and Data (CRCD) supports leading-edge research with free access to advanced computing hardware and software for fields across the entire research community, along with training and consultation by CRCD research faculty. Under the umbrella of Pitt Research, CRCD helps shape innovative ideas into reality using methods including simulation, data analysis, image and text analysis, and genomic sequencing analysis. Whether incorporating machine learning, building humanities data resources, or improving computation, CRCD helps expand the possible.
Job Summary
Organizes and deploys research data and data-based applications that are findable, accessible, interoperable, and reusable (FAIR). Responsible for executing and continuously improving data curation workflows, working closely with researchers, training teams, and repository staff to facilitate responsible and effective use of research data.
Essential Functions
- Managing unstructured and semi-structured data
- Excellent written and verbal communication skills
- Ability to work collaboratively
Physical Effort
Position is mostly sedentary, but may require travel on campus to attend meetings or events.
Assignment Category: Full-time regular
Job Classification: Staff.Data Curator
Job Family: Research
Job Sub Family: Data Support
Campus: Pittsburgh
Minimum Education Level Required: Master's Degree
Minimum Years of Experience Required: 3
Will this position accept substitution in lieu of education or experience: No
Work Schedule: Monday-Friday, 8:30 a.m.-5:00 p.m.
Work Arrangement: Monday-Friday, 8:30 a.m.-5:00 p.m.
Hiring Range: TBD Based Upon Qualifications
Relocation Offered: No
Visa Sponsorship Provided: No
Background Check: For position finalists, employment with the University will require successful completion of a background check.
Child Protection Clearances: Not Applicable
Required Documents: Resume
Optional Documents: Cover Letter