I have a table with this structure.
UserID | UserName | AnswerToQuestion1 | AnswerToQuestion2 | AnswerToQuestion3 1 | John | 1 | 0 | 1 2 | Mary | 1 | 1 | 0
I can't figure out what SQL query I would use to get a result set like this:
UserID | UserName | QuestionName | Response 1 | John | AnswerToQuestion1 | 1 1 | John | AnswerToQuestion2 | 0 1 | John | AnswerToQuestion3 | 1 2 | Mary | AnswerToQuestion1 | 1 2 | Mary | AnswerToQuestion2 | 1 2 | Mary | AnswerToQuestion3 | 0
I'm trying to split the three columns into three separate rows. Is this possible?
According to Itzik Ben-Gan in Inside Microsoft SQL Server 2008: T-SQL Querying, SQL Server goes through three steps when unpivoting a table:
- Generate copies
- Extract elements
- Remove rows with NULLs
Step 1: Generate copies
A virtual table is created that has a copy of each row in the orignal table for each column that is being unpivoted. Also, a character string of the column name is stored in a new column (call this the QuestionName column). *Note: I modified the value in one of your columns to NULL to show the full process.
UserID UserName AnswerTo1 AnswerToQ2 AnswerToQ3 QuestionName 1 John 1 0 1 AnswerToQuestion1 1 John 1 0 1 AnswerToQuestion2 1 John 1 0 1 AnswerToQuestion3 2 Mary 1 NULL 1 AnswerToQuestion1 2 Mary 1 NULL 1 AnswerToQuestion2 2 Mary 1 NULL 1 AnswerToQuestion3
Step 2: Extract elements
Then another table is created that creates a new row for each value from the source column which corresponds to the character string value in the QuestionName column. The value is stored in a new column (call this the Response column).
UserID UserName QuestionName Response 1 John AnswerToQuestion1 1 1 John AnswerToQuestion2 0 1 John AnswerToQuestion3 1 2 Mary AnswerToQuestion1 1 2 Mary AnswerToQuestion2 NULL 2 Mary AnswerToQuestion3 1
Step 3: Remove rows with NULLS
This step filters out any rows that were created with null values in the Response column. In other words, if any of the AnswerToQuestion columns had a null value, it would not be represented as an unpivoted row.
UserID UserName QuestionName Response 1 John AnswerToQuestion1 1 1 John AnswerToQuestion2 0 1 John AnswerToQuestion3 1 2 Mary AnswerToQuestion1 1 2 Mary AnswerToQuestion3 1
If you follow those steps, you can
An example below:
DECLARE @t1 TABLE (UserID INT, UserName VARCHAR(10), AnswerToQuestion1 INT, AnswertoQuestion2 INT, AnswerToQuestion3 INT ) INSERT @t1 SELECT 1, 'John', 1, 0, 1 UNION ALL SELECT 2, 'Mary', 1, NULL, 1 SELECT UserID, UserName, QuestionName, Response FROM ( SELECT UserID, UserName, QuestionName, CASE QuestionName WHEN 'AnswerToQuestion1' THEN AnswerToQuestion1 WHEN 'AnswerToQuestion2' THEN AnswertoQuestion2 ELSE AnswerToQuestion3 END AS Response FROM @t1 t1 CROSS JOIN ( SELECT 'AnswerToQuestion1' AS QuestionName UNION ALL SELECT 'AnswerToQuestion2' UNION ALL SELECT 'AnswerToQuestion3' ) t2 ) t3 WHERE Response IS NOT NULL
I had been working on Oracle long time back and has exposure to SQL. Now I need to brush up my SQL skills for a project on SQL Server 2008. Could you please recommend a good resource (online/book)?
Here are three great books, if you could buy only one and wanted a reference I'd recommend the first
"T-SQL Querying" book (http://www.amazon.com/Inside-Microsoft-Querying-Developer-Reference/dp/0735626030) has an interesting example, where, querying a table under default transaction isolation level during clustered index key column update, you may miss a row or read a row twice. It looks to be acceptable, since updating table/entity key is not a good idea anyway. However, I've updated this example so that the same happens, when you update non-clustered index key column value.
Following is the table structure:
SET NOCOUNT ON; USE master; IF DB_ID('TestIndexColUpdate') IS NULL CREATE DATABASE TestIndexColUpdate; GO USE TestIndexColUpdate; GO IF OBJECT_ID('dbo.Employees', 'U') IS NOT NULL DROP TABLE dbo.Employees; CREATE TABLE dbo.Employees ( empid CHAR(900) NOT NULL, -- this column should be big enough, so that 9 rows fit on 2 index pages salary MONEY NOT NULL, filler CHAR(1) NOT NULL DEFAULT('a') ); CREATE INDEX idx_salary ON dbo.Employees(salary) include (empid); -- include empid into index, so that test query reads from it ALTER TABLE dbo.Employees ADD CONSTRAINT PK_Employees PRIMARY KEY NONCLUSTERED(empid); INSERT INTO dbo.Employees(empid, salary) VALUES ('A', 1500.00),('B', 2000.00),('C', 3000.00),('D', 4000.00), ('E', 5000.00),('F', 6000.00),('G', 7000.00),('H', 8000.00), ('I', 9000.00);
This is what needs to be done in the first connection (on each update, the row will jump between 2 index pages):
SET NOCOUNT ON; USE TestIndexColUpdate; WHILE 1=1 BEGIN UPDATE dbo.Employees SET salary = 10800.00 - salary WHERE empid = 'I'; -- on each update, "I" employee jumps between 2 pages END
This is what needs to be done in the second connection:
SET NOCOUNT ON; USE TestIndexColUpdate; DECLARE @c INT WHILE 1 = 1 BEGIN SELECT salary, empid FROM dbo.Employees if @@ROWCOUNT <> 9 BREAK; END
Normally, this query should return 9 records we inserted in the first code sample. However, very soon, I see 8 records being returned. This query reads all it's data from the "idx_salary" index, which is being updated by previous sample code. This seems to be quite lax attitude towards data consistency from SQL Server. I would expect some locking coordination, when data is being read from index, while its key column is being updated.
Do I interpret this behavior correctly? Does this mean, that even non-clustered index keys should not be updated?
UPDATE: To solve this problem, you only need to enable "snapshots" on the db (READ_COMMITTED_SNAPSHOT ON). No more deadlocking or missing rows. I've tried summarize all of this here: http://blog.konstantins.net/2015/01/missing-rows-after-updating-sql-server.html
UPDATE 2: This seems to be the very same problem, as in this good old article: http://blog.codinghorror.com/deadlocked/
Do I interpret this behavior correctly?
Does this mean, that even non-clustered index keys should not be updated?
No. You should use a proper isolation level or make the application tolerate the inconsistencies that
READ COMMITTED allows.
This issue of missing rows is not limited to clustered indexes. It is caused by moving a row in a b-tree. Clustered and nonclustered indexes are implemented as b-trees with only tiny physical differences between them.
So you are seeing the exact same physical phenomenon. It applies every time your query reads a range of rows from a b-tree. The contents of that range can move around.
Use an isolation level that provides you the guarantees that you need. For read-only transactions the snapshot isolation level is usually a very elegant and total solution to concurrency. It seems to apply to your case.
This seems to be quite lax attitude towards data consistency from SQL Server. I would expect some locking coordination, when data is being read from index, while its key column is being updated.
This is an understandable request. On the other hand you specifically requested a low level of isolation. You can dial all the way up to
SERIALIZABLE of you want.
SERIALIZABLE presents you as-if serial execution.
Missing rows are just one special case of the many effects that
READ COMMITTED allows. It makes no sense to specifically prevent them while allowing all kinds of other inconsistencies.
SET NOCOUNT ON; USE TestIndexColUpdate; SET TRANSACTION ISOLATION LEVEL READ COMMITTED DECLARE @c INT WHILE 1 = 1 BEGIN DECLARE @count INT SELECT @count = COUNT(*) FROM dbo.Employees WITH (INDEX (idx_salary)) WHERE empid > '' AND CONVERT(NVARCHAR(MAX), empid) > '__' AND salary > 0 if @count <> 9 BREAK; END
I have a table that currently is a long list of devices and information about when they were sold. I need to take the table which would look something like this:
Item | Time Sold -------------------- A 05/2010 B 04/2010 C 03/2010 A 04/2010 A 05/2010
And then have a table with the item as the first column, with the count of the dates being the column headers, like below:
Item | Count_03/2010 | Count_04/2010 | Count_05/2010 ------------------------------------------------------------- A | 0 | 1 | 2 B | 0 | 1 | 0 C | 1 | 0 | 0
Is there a simple way of doing this? I know in other languages there is a single command to do it, was wondering if SQL had one.
My problem is that there is more than one table, and in some tables the months may be different than in other tables. Is there anyway to write a script that will apply to them all, by getting the variables listed and then using those to create the columns in the code? I could write one if I knew the months would always be the same, but since they are variable, is there a way of doing this.
SQL Server 2005 and up has something called
pivot, but it is still not one command and you need to know the values in the Time sold column. You can use a dynamic pivot approach as demonstrated by Itzik Ben-Gan in his Inside Microsoft SQL Server 2008: T-SQL Querying book
create table #test (Item char(1), TimeSold varchar(20)) insert #test values('A','05/2010') insert #test values('B','04/2010') insert #test values('C','03/2010') insert #test values('A','04/2010') insert #test values('A','05/2010') SELECT * FROM (SELECT Item,TimeSold FROM #test) AS pivTemp PIVOT ( COUNT(TimeSold) FOR TimeSold IN ([05/2010],[04/2010],[03/2010]) ) AS pivTable
I'm a developer with over 10 years experience in vb5/6, C#, VB.NET, 3 years of WCF too, design and architecture. I consider myself to be pretty good and can usually solve anything given time (like most good developers). I worked for a company for 8 years and during that time used a database called IBM UniVerse, it's an old Pick style database (multivalue), but it's ok. My trouble is, I've used a multivalue database for such a long time of my career I haven't got a lot of major advanced skills in SQL Server. I know enough about ANSI SQL to get by, but usually consult the net for syntax here and there, whilst in the 8 year job we automated SQL commands so I didn't even write SQL for at least a few years. Now I've left and I'm looking for another job and most of them say not only do you need to be an absolute expert at C#/VB.NET (which is fine) but you also need to be an expert at SQL Server and have the ability to write complex stored procs in T-SQL and performance tune. Now, I could probably get by just fine and learn as I go as I generally do anyway, but when it comes to write T-SQL on a stupid piece of paper in front of 4 people in an interview I don't do too well. I do like SQL Server as it has a management GUI unlike IBM UniVerse with it's DOS like command prompt! So using the management studio is fairly easy, setting indexes, creating tables, creating relationships etc.
So my question is, how to I go about getting up to speed very quickly in all things SQL Server? Most jobs now seems to be "Software Developer / DBA". In my opinion they are separate jobs, but oh well, can't do anything about that.
Buy a good book or two, and create a sample project to test your new skills.
I'd strongly suggest, in order:
I work with SQL almost on a daily basis and want to improve my SQL skills. I am looking for online resources or books for advanced queries for real life problems. I liked Joe Celko's books and I am looking for similar resources. I am also interested in SQL puzzles but ones which solve real life business apps. I am not into brain teasers, chess, games, cards and the like of SQL problems which look overly complicated or too obscure to be useful.
I want to learn to use SQL with fewer temp tables and more sophisticated joins, subqueries.. etc.
I checked out the resources from a similar SO question but I didn't find them very useful and was hoping there are others. Celko's books mention different approaches to the same problem and I find this pretty informative and eye opening. I want more! http://stackoverflow.com/questions/550545/any-recommended-websites-for-sql-puzzles
I have found this book to be a life-saver. This is targeted not only for beginners, but also advanced database professionals. I hope this helps you as well!
I loved the following book:
This question contains complete learning curve for Cpp from Beginner,*Intermediate* and advance which is very helpful.
Please suggest me same for Databases,to become a good Database Expert or a Good DBA. As there are may books and tutorials which to choose.
I found following books written by Itzik Ben-Gan very helpful:
Training Kit (Exam 70-461): Querying Microsoft SQL Server 2012 (MSPress, December 2012) http://www.amazon.com/Training-Kit-Exam-70-461-Microsoft/dp/0735666059
Inside Microsoft SQL Server 2008: T-SQL Programming (MSPress, 2009) http://www.amazon.com/Inside-Microsoft%C2%AE-Server%C2%AE-2008-Pro-Developer/dp/0735626022/ref=pd_sim_b_1?ie=UTF8&refRID=1ZFP1YGEB2QPZBZCBFZS
I'd say: learn the internals
From there you have a solid knowledge of querytuning, how the optmizer works and solid knowledge of how SQl Server works. Something every dba should know.