
Senior Data Engineer (Big Data)
- İstanbul
- Permanent
- Full-time
1. Data Centralization and Standardization
Centralized Data Repository: Customer Profile Management (CPM) creates a single, authoritative source of truth for key customers. This central repository helps eliminate data silos and discrepancies across different systems and applications.
Data Standardization: By enforcing standard definitions, formats, and rules, CPM ensures that data is consistent across the organization. This standardization is essential for accurate reporting, analysis, and decision-making.
2. Data Integration
Seamless Data Flow: In data engineering, data often comes from multiple sources with different formats and structures. CPM facilitates the integration of these diverse data sets by providing a unified view of the data, making it easier to integrate, transform, and load into data warehouses or data lakes.
Data Mapping and Transformation: CPM helps in mapping data from various sources to a common format. It also supports data transformation processes to ensure that the data aligns with the master data standards.
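As a rough illustration of this mapping step (a minimal sketch only; the source paths, column names, and master schema below are hypothetical, and a real CPM implementation would be tool-specific), source-to-master mapping in PySpark might look like this:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cpm-mapping").getOrCreate()

# Two hypothetical sources with different shapes: a CRM export and web signups.
crm = spark.read.parquet("s3://example-bucket/crm/customers/")
web = spark.read.json("s3://example-bucket/web/signups/")

# Map each source onto a shared master schema (customer_name, email, source_system).
crm_std = crm.select(
    F.col("cust_name").alias("customer_name"),
    F.lower(F.col("mail")).alias("email"),
    F.lit("crm").alias("source_system"),
)
web_std = web.select(
    F.col("full_name").alias("customer_name"),
    F.lower(F.col("email_addr")).alias("email"),
    F.lit("web").alias("source_system"),
)

# A unified view, ready to transform and load into a warehouse or lake.
customers = crm_std.unionByName(web_std)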
3. Data Quality Management
Data Cleansing: CPM tools often include features for data cleansing, such as deduplication, data enrichment, and validation. This ensures that the customer profile data is accurate and up to date, which is essential for reliable data engineering processes.
Error Detection and Correction: CPM continuously monitors data for errors or inconsistencies and provides mechanisms to correct them, thus improving the overall quality of data that data engineers work with.
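Continuing the hypothetical sketch above, a simple deduplication and validation pass might look like the following (the updated_at timestamp column and the email-format rule are illustrative assumptions, not part of this role's actual toolchain):

from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Keep only the most recently updated record per email address (deduplication).
latest_first = Window.partitionBy("email").orderBy(F.col("updated_at").desc())
deduped = (
    customers.withColumn("row_rank", F.row_number().over(latest_first))
    .filter(F.col("row_rank") == 1)
    .drop("row_rank")
)

# Flag records failing a basic email-format rule (validation flags, not deletion).
validated = deduped.withColumn(
    "email_valid", F.col("email").rlike(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
)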
4. Governance and Compliance
Data Governance: CPM ensures that data is managed consistently across the organization, aligning with regulatory requirements and internal policies.
Compliance: With regulations like GDPR and CCPA, CPM helps our organization maintain compliance by ensuring that personal and sensitive data is managed appropriately, with clear data lineage and audit trails.
5. Supporting Advanced Analytics
Reliable Data for Analytics: Data engineering pipelines feed analytics and machine learning models. CPM ensures that the data used in these processes is reliable, accurate, and consistent, leading to more trustworthy insights and predictions.
Data Enrichment: CPM can enhance raw data with additional context or metadata, making it more valuable for advanced analytics.
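Enrichment is often a join against curated reference data. Continuing the same hypothetical sketch (the reference file and its country_code and region columns are assumptions for illustration):

# Attach region context from a reference table keyed by country_code;
# assumes the validated customer view also carries a country_code column.
regions = spark.read.csv(
    "s3://example-bucket/reference/country_regions.csv", header=True
)
enriched = validated.join(
    regions.select("country_code", "region"), on="country_code", how="left"
)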
6. Scalability and Flexibility
Handling Large Volumes of Data: As organizations scale, the volume of data increases, making CPM critical in maintaining data quality and consistency across large, distributed systems.
Flexibility for Evolving Data Needs: CPM systems can adapt to changing business requirements, ensuring that the master data remains relevant and accurate even as new data sources and types are introduced.

Responsibilities:
- Responsible for developing and maintaining the customer profile management system, including data integration, data modeling, and data migration.
- Responsible for monitoring, measuring, and improving customer profile data quality metrics and reducing redundancy in the system.
- Responsible for defining and enforcing policies and procedures for managing data assets, including product data quality, data stewardship, and reference data security.
- Responsible for ensuring the accuracy, completeness, and consistency of customer profile data across the organization.
- Responsible for analyzing customer profile data and data sources and identifying trends and patterns that can be used to improve business processes and operational efficiency, including data-driven decision-making.
- Design, develop, implement, and support our emerging big data analytics capabilities through the development and maintenance of advanced data ingestion, processing, modeling, and reporting.
- Design and implement new or enhanced data processes, tools, or models.
- Working with stakeholders, provide operational and process execution support and ongoing user support.
- Write queries to answer almost any data question.
- Process unstructured data into a form suitable for analysis, and then perform the analysis if needed.
Qualifications:
- A Bachelor’s degree or Master’s degree in computer science, information technology, data engineering, software engineering, or a related field is preferred.
- A minimum of 5 years of experience in a data-related role is expected.
- Strong programming skills and proficiency in programming languages such as Python, Java, or Scala are essential.
- Knowledge of core data engineering skills such as data management and data visualization, and familiarity with data architecture, is important.
- Experience with data processing frameworks like Apache Spark, Apache Hadoop, or Apache Flink is beneficial for handling and processing large-scale data sets efficiently.
- Expertise in database systems and query languages, covering both SQL and NoSQL, is preferred.
- Experience working with large volumes of metadata and schemas.
- Hands-on experience in designing, developing, and optimizing ETL processes using IBM DataStage or similar ETL tools such as Informatica PowerCenter, Talend, or Apache NiFi.
- Understanding how the role complements others working in machine learning, data science, algorithms, and business intelligence is beneficial for collaboration.
- Experience with message broker systems: Kafka, etc.
- Experience with data pipeline and workflow management tools: Airflow, Azkaban, Luigi, etc.
- Experience with stream-processing systems: Storm, Spark Streaming, etc.
- Data warehousing knowledge, data modeling skills, familiarity with cloud platforms such as AWS or Azure, and an understanding of data security and governance are all essential for the role.
- While not mandatory, data engineering certifications such as Certified Data Engineer (CDE) or AWS Certified Big Data are beneficial.