Senior Big Data Developer
Location: McLean, VA
• Cleanse, manipulate and analyze large datasets (structured and unstructured data: XMLs, JSONs, PDFs) using the Hadoop platform.
• Develop Python, PySpark and Spark scripts to filter, cleanse, map and aggregate data.
• Build dashboards in R/Shiny for end-user consumption
• Manage and implement data processes (Data Quality reports)
• Develop data profiling, deduplication and matching logic for analysis
• Programming experience in Python, PySpark and Spark for data ingestion
• Programming experience on a Big Data platform using Hadoop
• Present ideas and recommendations to management on the best use of Hadoop and other technologies
• 5+ years of experience processing large volumes and varieties of data (structured and unstructured data such as XMLs, JSONs and PDFs), including writing code for parallel processing
• 3+ years of programming experience in Hadoop, Spark and Python for data processing and analysis.
• Strong SQL experience is a must
• 3+ years of experience using the Hadoop platform and performing analysis. Familiarity with Hadoop cluster environments and resource management configurations for analysis work
• Ability to work in a UNIX environment
• Detail-oriented. Excellent communication skills (verbal and written)
• Must be able to manage multiple priorities and meet deadlines
• Degree in Computer Science, Statistics, Economics, Business, Mathematics or related field