Sr. Cloud Big Data Engineer

Location: McLean, VA



Ideal candidates have working knowledge of the Hadoop and AWS ecosystems. This includes, but is not limited to, relational data stores, data integration techniques, XML, Python, Spark, and ETL techniques in the Hadoop and AWS ecosystems. Projects use Agile methodology to enable new business data capabilities.

Requirements/Critical Skills (4-6 years working experience):
– Able to work efficiently in a UNIX/Linux environment
– Experience working with in-memory computing using R, Python, Spark, PySpark and Scala.
– Experience in parsing and shredding XML and JSON, shell scripting, and SQL
– Experience working with the Hadoop ecosystem: HDFS, Hive, Ambari, Atlas, Ranger
– Experience working with the AWS ecosystem: S3, EMR, EC2, Lambda, CloudFormation, CloudWatch, SNS/SQS
– Experience working with SQL and NoSQL databases
– Experience with DevOps and CI/CD implementations using Docker, Jenkins, or test-driven development patterns
– Experience designing and developing data sourcing routines utilizing typical data quality functions involving standardization, transformation, rationalization, linking and matching
– Knowledge of data, master data, and metadata-related standards, processes, and technology
– Experience working with multi-terabyte data sets (structured, semi-structured, and unstructured data)
– Knowledge of job scheduling and monitoring tools (e.g., AutoSys, Airflow)

Preferred Skills:
• Experience with financial and information management needs is a plus
• Experience in Financial Services/Secondary Mortgage industry is a plus
• Demonstrated flexibility and adaptability to change; Agile methodology experience with Jira
• Demonstrated ability to manage multiple priorities and deadlines
• Ability to work independently or as part of a team
• Experience with Attunity CDC and Kafka
 
Requisition Competencies  
 
Competency   Description                  Requisition Proficiency   Requisition Years Experience   Action
100055       Agile Software Development   Medium                    1 - 3 Years                    No Action
100775       Hadoop - Apache              High                      3 - 5 Years                    No Action
101005       Linux                        High                      5 - 7 Years                    No Action