Bridges between Excel and SQL

I have a good friend who loves using excel. I want to encourage her to learn some SQL and database queries. In this post, I will share the most important bridge from my known/familiar territory to my new territory. My “secret” bridge is GOOGLE. With the correct google search term set (utilizing one’s known knowledge), google will give answers that applicable to the unfamiliar knowledge. Little by little, and piece by piece, the knowledge of unfamiliar field become familiar and known. Today, the known territory is “Excel”. The new territory is “SQL”.

Continue reading “Bridges between Excel and SQL”

When SQL Meets Spark 1 –connect PySpark to SQL Server

Today, I will start a new series of blogs about Spark. Same as in the Series of Python, I will share my journey of becoming a big data developer from a SQL developer. One year ago, in one of my favorite local SQL Server meeting, I suddenly got the aha moment. From there, I have run in a fast track on big data coding. Now, I am the lead developer of one of our project based on data in Hadoop. I will share my commonly used codes so that you can start your journey faster.

Continue reading “When SQL Meets Spark 1 –connect PySpark to SQL Server”

When Python Meets SQL–part 5: read/write multiple sheets from one excel file

After I finished re-testing this code, I suddenly realized that SQL connection and SQL commands did not appear. I am happy that this post in only name as part 5. The previous four posts all have SQL inside. This one is just to show how flexible pandas could be. When I had an excel file with several tabs and the data structure on every tab are same, this code was used to save time and to avoid mistakes:

Continue reading “When Python Meets SQL–part 5: read/write multiple sheets from one excel file”

When Python Meets SQL-part 3: load excel file data into SQL Server without creating table

At the beginning of my Python ETL journey, I created tables in SQL server and insert to those tables. Back then, I thought this is the only way. During my work using pySpark, I used pySpark to write SQL tables from pySpark dataframe. The best thing is I don’t need to create tables pySpark does all for me. (I will wrote pySpark codes later). That good experience makes me think: is there a better way in python? Is there a way to write data into SQL server without creating table?

My best friend google told me: yes. And here is how:

Continue reading “When Python Meets SQL-part 3: load excel file data into SQL Server without creating table”

When Python Meets SQL –part 2: loop to ETL multiple tables

In part 1 of this series, I mentioned loop after I failed to fast load data to sql server in my test code with my personal computer. In this post, I will show the advantage of using loop in python to move multiple tables/queries. The beauty of learning and using python is we can skip many mouse clicks in SQL server Import and Export Wizard. Let’s begin.

Continue reading “When Python Meets SQL –part 2: loop to ETL multiple tables”

When Python Meets SQL–part 1: move data between two SQL servers with fast load

I will organize several code that I have used in my daily work to do simple ETLs. This post will be first one: how to move table (or query result from one SQL server) to another. Even though the task could be handled pretty easily in SSIS and SS Import and Export Wizard, the python code provides the perfect starting point for SQL developers to use python.

Continue reading “When Python Meets SQL–part 1: move data between two SQL servers with fast load”

Pick up blogging again

Time really flies when I am having fun. I can’t believe that it has been more than one years since my last post. I had a lot of fun during this period of time: implementing python and PySpark in my daily work, picking up data science codes, learning graph, giving presentation at local SQL server user group meeting… … As a result, I haven’t get anything done with this blog.

Continue reading “Pick up blogging again”

Trick SQL Server to use the best Execution Plan

What is the two most important words that one should look for in SQL Server Execution plan?  Scan ! Seek!

SCAN a big table is a disaster. The rescue is: try your best to let SQL  Server SEEK a big table.

(William Wolf taught me this many many years ago at a SQL Saturday meeting. This knowledge is so powerful and it makes one of my query work so good that I want to share with you.)

Continue reading “Trick SQL Server to use the best Execution Plan”

Look at me! I made the fire!

In the movie Cast Away, the character (by Tom Hanks) danced and jumped around a bonfire on beach and shouted many times: “look at me! I made the fire!”, even though he knew that nobody could look at him at that isolated, far away island. He simply could not stop celebrating his exciting achievement. That is exactly same feeling on a late Friday afternoon, after I figure out how to solve two of my problems at same time. Those two problems are: first. let SSRS pass a parameter to a query in DB2, and second, that parameter could be one or several account_ID.

Continue reading “Look at me! I made the fire!”