In this post we are going to learn how to improve the query performance of your Glue ETL job by adding parallelism to your database JDBC connection. If you are running an AWS Glue ETL job against a large database table and you are running into OOM (out-of-memory) errors because all the… Read More AWS Glue : How to perform JDBC read parallelism
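To illustrate the idea of a parallel JDBC read, here is a minimal sketch, not taken from the post itself, of how a numeric key range can be split into one WHERE clause per reader so that no single connection has to pull the whole table. The function name `partition_predicates` and the column name `id` are hypothetical; the resulting list is the kind of input that Spark's `DataFrameReader.jdbc(..., predicates=[...])` accepts.

```python
def partition_predicates(column, lower, upper, num_partitions):
    """Split a numeric key range into contiguous, non-overlapping WHERE clauses.

    Hypothetical helper for illustration only: each predicate can be handed
    to a separate JDBC reader so the table is fetched in parallel slices.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be >= 1")
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    if num_partitions == 1:
        # A single reader scans everything.
        return ["1 = 1"]

    # Integer stride between partition boundaries (at least 1).
    stride = max((upper - lower) // num_partitions, 1)
    bounds = [lower + i * stride for i in range(1, num_partitions)]

    # First slice is open-ended below and also picks up NULL keys.
    predicates = [f"{column} < {bounds[0]} OR {column} IS NULL"]
    # Middle slices are half-open ranges [lo, hi).
    for lo, hi in zip(bounds, bounds[1:]):
        predicates.append(f"{column} >= {lo} AND {column} < {hi}")
    # Last slice is open-ended above, so rows beyond `upper` are not lost.
    predicates.append(f"{column} >= {bounds[-1]}")
    return predicates


if __name__ == "__main__":
    for clause in partition_predicates("id", 0, 100, 4):
        print(clause)
```

Because the first and last slices are open-ended, every row lands in exactly one slice even if the bounds are only estimates; badly chosen bounds skew the slice sizes but do not drop data.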
Error Message :- When I was working with an AWS Glue interactive session, I got the error User arn:aws:iam::<$aws-account-id>:role/AWSGlueServiceRole-glueworkshop/GlueJobRunnerSession is not authorized to perform iam:PassRole on resource arn:aws:iam::<$aws-account-id>:role/AWSGlueServiceRole-glueworkshop because no identity-based policy allows the iam:PassRole action. However, you may receive a similar error message while working with other services too. Before we move on to the resolution, let’s understand… Read More How to Resolve iam:PassRole error message?
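As a rough illustration of what an identity-based policy that allows iam:PassRole can look like, the sketch below builds one as a Python dictionary and prints it as JSON. This is an assumption about a common fix, not necessarily the resolution described in the post. The helper name `pass_role_policy` is hypothetical, the account ID `123456789012` is a placeholder, and restricting the role to `glue.amazonaws.com` with the `iam:PassedToService` condition key is an optional hardening choice.

```python
import json


def pass_role_policy(role_arn, service="glue.amazonaws.com"):
    """Build an IAM policy document that lets a principal pass one role.

    Hypothetical helper for illustration; attach the resulting JSON to the
    user or role named in the error message.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:PassRole",
                # Only this specific role may be passed.
                "Resource": role_arn,
                # Assumption: limit passing to the Glue service only.
                "Condition": {
                    "StringEquals": {"iam:PassedToService": service}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID; substitute your own.
    arn = "arn:aws:iam::123456789012:role/AWSGlueServiceRole-glueworkshop"
    print(json.dumps(pass_role_policy(arn), indent=2))
```

Scoping `Resource` to a single role ARN, instead of `*`, keeps the grant as narrow as the error message requires.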
In this video we are going to learn: What is an AWS account alias? Why would someone need to create an AWS account alias? How do you create an AWS account alias?
AWS Glue – AWS Glue is a serverless ETL tool developed by AWS. It is built on top of Spark. Because Spark is a distributed processing engine, by default it creates multiple output files. Generating a single file – You might have a requirement to create a single output file. In order for you to create… Read More Create single file in AWS Glue (pySpark) and store as custom file name S3
Issue – While deploying a Serverless Lambda function (sls deploy command), I got the error message below: An error occurred: EsLambdaFunction – The provided execution role does not have permissions to call CreateNetworkInterface on EC2 (Service: AWSLambdaInternal; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: xxx). Resolution – In order to resolve the issue, I have… Read More Serverless – The provided execution role does not have permissions to call CreateNetworkInterface on EC2
Issue – Where are the replication agent jobs located? Resolution – While troubleshooting a replication issue, it’s always good to know which server hosts which job. A transactional replication scenario involves these agents: Snapshot Agent, Log Reader Agent, Distribution Agent. If the Publisher and the Distributor are the same server and you have configured a push subscription, all the jobs reside on the Publisher\Distributor server. If Publisher and… Read More Replication – Agents Job Location
Issue – Replication Monitor could not open the Detail Window. Cause – When the Publisher is SQL 2012 and the Distributor is SQL 2014, and you try to launch Replication Monitor from SQL 2014, you will get this error message. Resolution – Launch Replication Monitor from SQL 2012, i.e. the Publisher.
Issue – The replication Log Reader Agent was failing with the error messages below. 2018-04-24 19:33:07.854 Status: 32768, code: 53044, text: ‘Validating publisher’. 2018-04-24 19:33:07.873 Status: 4096, code: 20024, text: ‘Initializing’. 2018-04-24 19:33:07.873 The agent is running. Use Replication Monitor to view the details of this agent session. 2018-04-24 19:33:07.879 Status: 0, code: 20011, text: ‘The process… Read More Replication Issue – Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission.
Issue – My BI developer informed me that they are no longer able to connect to the Oracle data sources they created in their cube. Troubleshooting – As far as I am aware, in order to connect to Oracle you need the Oracle client installed properly. As it was working earlier, my first impression was that they definitely have… Read More Issues related to Oracle Client in SQL Server.
Issue – How to create an ODBC connection from a text file. Resolution – Step 1 – Create the text file and ensure file extensions are shown. Step 2 – If you don’t know how to show file extensions, select your file (in my case, odbc), click View on the top navigation bar, and select File name… Read More How to create ODBC connection from text file.
Issue – How to read\write different file formats in HDFS by using pyspark.
File Format, Action, Procedure, example without compression:
Text file. Read: sc.textFile(), e.g. orders = sc.textFile("/user/BDD/navnit/data-master/retail_db/orders"). Write: rdd.saveAsTextFile(), e.g. orders.saveAsTextFile("/user/BDD/navnit/saveTextFile/orders").
Sequence file. Read: sc.sequenceFile(), e.g. ordersSF = sc.sequenceFile('/user/BDD/navnit/saveSequenceFile/orders'). Write: PipelinedRDD.saveAsSequenceFile(), e.g. ordersKV.saveAsSequenceFile('/user/BDD/navnit/saveSequenceFile/orders').
Avro file. Read: sqlContext.read.format("com.databricks.spark.avro").load(), e.g. orders = sqlContext.read.format("com.databricks.spark.avro").load("/home/BDD/navnit/orders/"). Write: dataFrame.write.format("com.databricks.spark.avro").save(), e.g. orders.write.format("com.databricks.spark.avro").save("/user/BDD/navnit/saveAvroFile/orders").
Parquet file. Read: sqlContext.read.parquet(), e.g. ordersParquet =… Read More Reading\Writing Different file format in HDFS by using pyspark
Issue – A customer was getting the error below when running any query. Msg 468, Level 16, State 9, Line 36 Cannot resolve the collation conflict between “SQL_Latin1_General_CP1_CI_AS” and “SQL_Latin1_General_CP1_CS_AS” in the equal to operation. Background – The table was part of replication, and when he ran the query on the subscriber he was getting a collation conflict… Read More Cannot resolve the collation conflict between “xxx” and “xxx” in the equal to operation
Issue – How to export a .dtsx file from SSISDB, since from SQL 2012 onward you will not see .dtsx packages listed in Integration Services. Note – This is a shortcut solution. Step 1: – Go to Integration Services Catalogs → Project Step 2: – Right-click on Project → Export Step 3: – Save project file… Read More How to export SSIS package ( dtsx file) from SSISDB
Requirement – You are planning to migrate your in-house SQL Server to the Amazon cloud and are wondering which option will be best for you, Amazon RDS or SQL Server on EC2. Here, I have created a comparison table based on the AWS whitepaper and documents. Feature comparison (Features, RDS, SQL on EC2): Control, AWS, In house… Read More Comparison between AMAZON RDS and SQL Server on EC2