Friday, February 18, 2011

Introduction About SSIS


Introduction About SSIS:

SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server DB Software which can be used to perform a broad range of data migration tasks.

SSIS is a platform for data integration and workflow applications.It features a fast and flexible data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data.

First released with Microsoft SQL Server 2005, SSIS replaced DTS (Data Transformation Services), which had been a feature of SQL Server since Version 7.0. Unlike DTS, which was included in all versions, SSIS is only available in the "Standard" and "Enterprise" editions.

Features:

The SSIS Import/Export Wizard lets the user create packages that move data from a single data source to a destination with no transformations. The Wizard can quickly move data from a variety of source types to a variety of destination types, including text files and other SQL Server instances.

Developers tasked with creating or maintaining SSIS packages use a visual development tool based on Microsoft Visual Studio called the SQL Server Business Intelligence Development Studio (BIDS).

It allows users to edit SSIS packages using a drag-and-drop user interface. A scripting environment in which to write programming code is also available in the tool. A package holds a variety of elements that define a workflow. Upon package execution, the tool provides color-coded, real-time monitoring.

Connections 

A connection includes the information necessary to connect to a particular data source. Tasks can reference the connection by its name, allowing the details of the connection to be changed or configured at runtime.

Tasks 

A task is an atomic work unit that performs some action. There are a couple of dozen tasks that ship in the box, ranging from the file system task (which can copy or move files) to the data transformation task. The data transformation task actually copies data; it implements the ETL features of the product.

Precedence constraints 

Tasks are linked by precedence constraints. The precedence constraint preceding a particular task must be met before that task executes. The runtime supports executing tasks in parallel if their precedence constraints so allow. Constraints may otherwise allow different paths of execution depending on the success or failure of other tasks. Together with the tasks, precedence constraints comprise the workflow of the package.

Event handlers 

A workflow can be designed for a number of events in the different scopes where they might occur. In this way, tasks may be executed in response to happenings within the package —such as cleaning up after errors.

Variables 

Tasks may reference variables to store results, make decisions, or affect their configuration.

- A package may be saved to a file or to a store with a hierarchical namespace within a SQL Server instance. In either case, the package content is persisted in XML.

- Once completed, the designer also allows the user to start the package's execution. Once started, the package may be readily debugged or monitored.

Features of the data flow task

SSIS provides the following built-in transformations:
  • Conditional Split
  • Multicast
  • Union-All, Merge, and Merge Join
  • Sort
  • Fuzzy Grouping
  • Lookup and Fuzzy Lookup
  • Percentage Sampling and Row Sampling
  • Copy/Map, Data Conversion, and Derived Column
  • Aggregation
  • Data Mining Model Training, Data Mining Query, Partition Processing, and Dimension Processing
  • Pivot and
  • Slowly Changing Dimension
  • Script Component
The Conditional Split transformation is used to speed up the query on the source table based on a particular condition. It is similar to the "if..else" construct in the C language.

Other included tools

Aside from the Import/Export Wizard and the designer, the product includes a few other notable tools.

DTEXEC executes a package from the command line wherever it may be stored. Before running the package, the tool may be instructed to apply configuration information, which will allow the same package to be reused with slightly different parameters, including different connection strings for its endpoints.

DTUTIL provides the ability to manage packages, again from the Command Prompt. The tool can copy or move a package from a file into the server store, or back out again. Among a few other sundry functions, it can be used to delete, rename, encrypt, or decrypt packages.

Extensibility and programmability

Users may write code to define their own connection objects, log providers, transforms, and tasks.

SSIS features a programmable object model that allows developers to write their own hosts for package execution. Such a host can respond to events, start and stop packages, and so on. The object model also allows developers to create, store, and load packages, as well as create, destroy, and modify any of the contained objects.

It can be used on all versions of SQL Server 2005 except Express and Workgroup.

Monday, February 14, 2011

Are You Master in SQL Server DB...?

 If you are able to answer the following basic questions,  You are the Master in SQL Server

Database Concepts

· What is database or database management systems (DBMS)?
· What is difference between DBMS and RDBMS?
· What are CODD rules?
· Is access database a RDBMS?
· What is the main difference between ACCESS and SQL SERVER?
· What is the difference between MSDE and SQL SERVER 2000?
· What is SQL SERVER Express 2005 Edition?
· What is SQL Server 2000 Workload Governor?
· What is the difference between SQL SERVER 2000 and 2005?
· What are E-R diagrams?
· How many types of relationship exist in database designing?
· What is normalization? What are different types of normalization?
· What is denormalization?
· Can you explain Fourth Normal Form?
· Can you explain Fifth Normal Form?
· Have you heard about sixth normal form?
· What is Extent and Page?
· What are the different sections in Page?
· What are page splits?
· In which files does actually SQL Server store data?
· What is Collation in SQL Server?
· Can we have a different collation for database and table?

SQL

· Revisiting basic syntax of SQL?
· What are “GRANT” and “REVOKE’ statements?
· What is Cascade and Restrict in DROP table SQL?
· How to import table using “INSERT” statement?
· What is a DDL, DML and DCL concept in RDBMS world?
· What are different types of joins in SQL?
· What is “CROSS JOIN”?
· You want to select the first record in a given set of rows?
· How do you sort in SQL?
· How do you select unique rows using SQL?
· Can you name some aggregate function is SQL Server?
· What is the default “SORT” order for a SQL?
· What are Wildcard operators in SQL Server?
· What is the difference between “UNION” and “UNION ALL”?
· What are cursors and what are the situations you will use them?
· What are the steps to create a cursor?
· What is a self-join?
· What are the different Cursor Types?
· What are “Global” and “Local” cursors?
· What is “Group by” clause?
· What is ROLLUP?
· What is the difference between DELETE and TRUNCATE
· What is CUBE?
· What is the difference between “HAVING” and “WHERE” clause?
· What is “COMPUTE” clause in SQL?
· What is “WITH TIES” clause in SQL?
· What does “SET ROWCOUNT” syntax achieves?
· What is a Sub-Query?
· What is “Correlated Subqueries”?
· What is “ALL” and “ANY” operator? or
· What is a “CASE” statement in SQL? or
· What does COLLATE Keyword in SQL signify? or
· What is CTE (Common Table Expression)?
· Select addresses which are between ‘1/1/2004’ and ‘1/4/2004’?
· What is TRY/CATCH block in T-SQL?
· What is UNPIVOT?
· What are RANKING functions?
· Why should you use CTE rather than simple views?
· What is RANK ()?
· What is ROW_NUMBER()?
· What is DENSE_RANK()?
· What is NTILE()?
· What is SQl injection?
· What (is PIVOT feature in SQL Server?

.NET Integration

· What are steps to load a .NET code in SQL SERVER 2005?
· How can we drop an assembly from SQL SERVER?
· Are changes made to assembly updated automatically in database?
· Why do we need to drop assembly for updating changes?
· How to see assemblies loaded in SQL Server?
· I want to see which files are linked with which assemblies?
· Does .NET CLR and SQL SERVER run in different process?
· Does .NET controls SQL SERVER or is it vice-versa?
· Is SQLCLR configured by default?
· How to configure CLR for SQL SERVER?
· How does SQL Server control .NET run-time?
· In previous versions of .NET it was done via COM interface
“ICorRuntimeHost”.
·
· What is a “SAND BOX” in SQL Server 2005?
· What is an application domain?
· How is .NET Appdomain allocated in SQL SERVER 2005?
· What is Syntax for creating a new assembly in SQL Server 2005?
· Do Assemblies loaded in database need actual .NET DLL?
· You have an assembly, which is dependent on other assemblies; will SQL Server
load the dependent assemblies?
· Does SQL Server handle unmanaged resources?
· What is Multi- tasking?
· What is Multi-threading?
· What is a Thread?
· Can we have multiple threads in one App domain?
· What is Non-preemptive threading?
· What is pre-emptive threading?
· Can you explain threading model in SQL Server?
· How does .NET and SQL Server thread work?
· How is exception in SQLCLR code handled?
· Are all .NET libraries allowed in SQL Server?
· How many types of permission level are there for an assembly?
· In order that an assembly gets loaded in SQL Server what type of checks are
done?
· Can you name system tables for .NET assemblies?
· Are two version of same assembly allowed in SQL Server?
· How are changes made in assembly replicated?
· Is it a good practice to drop a assembly for changes?
· In one of the projects following steps where done, will it work?
· What does Alter assembly with unchecked data signify?
· How do I drop an assembly?
· Can we create SQLCLR using .NET framework 1.0?
· While creating .NET UDF what checks should be done
· How do you define a function from the .NET assembly?
· Can you compare between T-SQL and SQLCLR?
· With respect to .NET is SQL SERVER case sensitive?
· Does case sensitive rule apply for VB.NET?
· Can nested classes be accessed in T-SQL?
· Can we have SQLCLR procedure input as array?
· Can object data type be used in SQLCLR?
· How is precision handled for decimal data types in .NET?
· How do we define INPUT and OUTPUT parameters in SQLCLR?
· Is it good to use .NET data types in SQLCLR?
· How to move values from SQL to .NET data types?
· What is System.Data.SqlServer?
· What is SQLContext?
· Can you explain essential steps to deploy SQLCLR?
· How do create function in SQL Server using .NET?
· How do we create trigger using .NET?
· How to create User Define Functions using .NET?
· How to create aggregates using .NET?
· What is Asynchronous support in ADO.NET?
· What is MARS support in ADO.NET?
· What is SQLbulkcopy object in ADO.NET?
· How to select range of rows using ADO.NET?
· What are different types of triggers in SQl SERVER 2000?
· If we have multiple AFTER Triggers on table how can we define the sequence of
the triggers.
· How can you raise custom errors from stored procedure?

ADO.NET

· Which are namespaces for ADO.NET?
·
· Can you give a overview of ADO.NET architecture?
· What are the two fundamental objects in ADO.NET?
· What is difference between dataset and data reader?
· What are major difference between classic ADO and ADO.NET?
· What is the use of connection object?
· What are the methods provided by the command object?
· What is the use of “Data adapter”?
· What are basic methods of “Data adapter”?
· What is Dataset object?
· What are the various objects in Dataset?
· How can we connect to Microsoft Access, FoxPro, Oracle etc?
· What is the namespace to connect to SQL Server?
· How do we use stored procedure in ADO.NET?
· How can we force the connection object to close?
· Can we optimize command object when there is only one row?
· Which is the best place to store connection string?
· What are steps involved to fill a dataset?
· What are the methods provided by the dataset for XML?
· How can we save all data from dataset?
· How can we check for changes made to dataset?
· How can we add/remove row is in “DataTable” object of “Dataset”?
· What is basic use of “Data View”?
· What is difference between “Dataset” and “Data Reader”?
· How can we load multiple tables in a Dataset?
· How can we add relation’ s between table in a Dataset?
· What is the use of Command Builder?
· What is difference between “Optimistic” and “Pessimistic” locking?
· How many way’s are there to implement locking in ADO.NET?
· How can we perform transactions in .NET?
· What is difference between Dataset? Clone and Dataset. Copy?
· What’s the difference between Dataset and ADO Record set?

Notification Services

· What are notification services?
· What are basic components of Notification services?
· Can you explain architecture of Notification Services?
· Which are the two XML files needed for notification services?
· What is Nscontrols command?
· What are the situations you will use “Notification” Services?

Service Broker

· What do we need Queues?
· What is “Asynchronous” communication?
· What is SQL Server Service broker?
· What are the essential components of SQL Server Service broker?
· What is the main purpose of having Conversation Group?
· How to implement Service Broker?
· How do we encrypt data between Dialogs?

XML Integration

· What is XML?
· What is the version information in XML?
· What is ROOT element in XML?
· If XML does not have closing tag will it work?
· Is XML case sensitive?
· What is the difference between XML and HTML?
· Is XML meant to replace HTML?
· Can you explain why your project needed XML?
· What is DTD (Document Type definition)?
· What is well formed XML?
· What is a valid XML?
· What is CDATA section in XML?
· What is CSS?
· What is XSL?
· What is Element and attributes in XML?
· Can we define a column as XML?
· How do we specify the XML data type as typed or untyped?
· How can we create the XSD schema?
· How do I insert in to a table that has XSD schema attached to it?
· What is maximum size for XML data type?
· What is Xquery?
· What are XML indexes?
· What are secondary XML indexes?
· What is FOR XML in SQL Server?
· Can I use FOR XML to generate SCHEMA of a table and how?
· What is the OPENXML statement in SQL Server?
· I have huge XML file, which we want to load in database?
· How to call stored procedure using HTTP SOAP?
· What is XMLA?

Data Warehousing / Data Mining

· What is “Data Warehousing”?
· What are Data Marts?
· What are Fact tables and Dimension Tables?
· What is Snow Flake Schema design in database?
· What is ETL process in Data warehousing?
· How can we do ETL process in SQL Server?
· What is “Data mining”?
· Compare “Data mining” and “Data Warehousing”?
· (What is BCP?
· How can we import and export using BCP utility?
· During BCP we need to change the field position or eliminate some fields how
can we achieve this?
· What is Bulk Insert?
· What is DTS?
· Can you brief about the Data warehouse project you worked on?
· What is an OLTP (Online Transaction Processing) System?
· What is an OLAP (On- line Analytical processing) system?
· What is Conceptual, Logical and Physical model?
· What is Data purging?
· What is Analysis Services?
· What are CUBES?
· What are the primary ways to store data in OLAP?
· What is META DATA information in Data warehousing projects?
· What is multi-dimensional analysis?
· What is MDX?
· How did you plan your Data warehouse project?
· What are different deliverables according to phases?
· Can you explain how analysis service works?
· What are the different problems that “Data mining” can solve?
· What are different stages of “Data mining”?
· What is Discrete and Continuous data in Data mining world?
· What is MODEL is Data mining world?
· How are models actually derived?
· What is a Decision Tree Algorithm?
· Can decision tree be implemented using SQL?
· What is Naïve Bayes Algorithm?
· Explain clustering algorithm?
· Explain in detail Neural Networks?
· What is Back propagation in Neural Networks?
· What is Time Series algorithm in data mining?
· Explain Association algorithm in Data mining?
· What is Sequence clustering algorithm?
· What are algorithms provided by Microsoft in SQL Server?
· How does data mining and data warehousing work together?
· What is XMLA?
· What is Discover and Execute in XMLA?

Integration Services / DTS

· What is Integration Services import / export wizard?
· What are prime components in Integration Services?
· How can we develop a DTS project in Integration Services?

Replication

· Whats the best way to update data between SQL Servers?
· What are the scenarios you will need multiple databases with schema?
· How will you plan your replication?
· What are publisher, distributor and subscriber in “Replication”?
· What is “Push” and “Pull” subscription?
· Can a publication support push and pull at one time?
· What are different models / types of replication?
· What is Snapshot replication?
· What are the advantages and disadvantages of using Snapshot replication?
· What type of data will qualify for “Snapshot replication”?
· What is the actual location where the distributor runs?
· Can you explain in detail how exactly “Snapshot Replication” works?
· What is merge replication?
· How does merge replication works?
· What are advantages and disadvantages of Merge replication?
· What is conflict resolution in Merge replication?
· What is a transactional replication?
· Can you explain in detail how transactional replication works?
· What are data type concerns during replications?

Reporting Services

· Can you explain how can we make a simple report in reporting services?
· How do I specify stored procedures in Reporting Services?
· What is the architecture for “Reporting Services “?

Database Optimization

· What are indexes?
· What are B-Trees?
· I have a table which has lot of inserts, is it a good database design to create
indexes on that table?
· What are “Table Scan’s” and “Index Scan’s”?
· What are the two types of indexes and explain them in detail?
· What is “FillFactor” concept in indexes?
· What is the best value for “FillFactor”?
· What are “Index statistics”?
· How can we see statistics of an index?
· How do you reorganize your index, once you find the problem?
· What is Fragmentation?
· How can we measure Fragmentation?
· How can we remove the Fragmented spaces?
· What are the criteria you will look in to while selecting an index?
· What is “Index Tuning Wizard”?
· What is an Execution plan?
· How do you see the SQL plan in textual format?
· What is Nested join, Hash join and Merge join in SQL Query plan?
· What joins are good in what situations?
· What is RAID and how does it work?

Transaction and Locks

· What is a “Database Transactions “?
· What is ACID?
· What is “Begin Trans”, “Commit Tran”, “Rollback Tran” and “Save Tran”?
· What are “Checkpoint’s” in SQL Server?
· What are “Implicit Transactions”?
· Is it good to use “Implicit Transactions”?
· What is Concurrency?
· How can we solve concurrency problems?
· What kind of problems occurs if we do not implement proper locking strategy?
· What are “Dirty reads”?
· What are “Unrepeatable reads”?
· What are “Phantom rows”?
· What are “Lost Updates”?
· What are different levels of granularity of locking resources?
· What are different types of Locks in SQL Server?
· What are different Isolation levels in SQL Server?
· What are different types of Isolation levels in SQL Server?
· If you are using COM+, what “Isolation” level is set by default?
· What are “Lock” hints?
· What is a “Deadlock”?
· What are the steps you can take to avoid “Deadlocks”?
· How can I know what locks are running on which resource?
·