ProcessTrends.Com
Home
Chart Gallery
Video Tutorials
Trend Analysis
Using R
Data Visualization
 Global Warming
Map Excel Data with Google Earth
Chart Doctor
Advanced Chart Techniques
Links
What's New
Downloads
  About      
 Search Site RSS
   04/4/2008 4:30 PM

 

Excel Data

Top
Top

Excel Database Overview 

Excel is not a  formal database, however, Excel has powerful database capabilities that can be used if the user properly sets-up the worksheet.   Workbook organizational pointers include: 

  • Keep data in separate worksheet; put reports and charts in separate worksheets

  • Do not leave blank rows or columns in dataset

  • Generally good idea to start an excel Database in cell A1

  • Label Columns in row 1 only; use word-wrap to keep column title in one cell.

Excel data lists are contiguous areas with titles in the top row of the data list. Pressing Control + Shift + 8 select the data list.

 

Top

Date and Time (DT) Formulas In Excel 

Date and time (DT) are critical variables for process data. Excel lets the user treat DT as one single data item.  Excel stores DT as a fraction of days since January 1, 1900. Note that time is treated as a fraction of a day. Mastery of Excel’s DT capabilities is critical to effective use of Excel for process data management.  

There are two key aspects of mastering DT: 

1.      DT Formatting 

2.      DT Formulas

Excel’s Date & Time Formats: 10/18/02 6:10 PM

Format Selection

Excel Display

Serial Format

37547.75694

MM/DD/YY HH:MM

10/18/02 18:10

MM/DD/YY HH:MM AM/PM

10/18/02 6:10 PM

MMM-YY

Oct-02

MMM-YYYY

October, 2002

MMMM

October

MMMM DD, YYYY

October 18, 2002

HH:MM:SS AM/PM

6:10 PM

D 18
DDD Fri
DDDD Friday

 DT Formatting  - There are numerous pre-defined DT formats that the user can select, in addition, the user can also create Custom Formats.

 

It is important to remember that Excel is storing one serial number for the DT , it can display the DT just about any way that a user wants.

DT Functions  - Excel has an impressive set of built in DT calculation functions that will return the day of week, day of month, day of year, hour, minute, second , month, week number or year of the DT in  cell. In addition, simple formulas can tell you if it is a leap year, the number of days in a month, the number of days between two dates or the remaining days in a year.

DT Built - In Functions and Formulas

Cell A1 = 10/18/02 6:10 PM

DT Parameter

Formula

Result

Year

=Year(A1)

2002

Month

=Month(A1)

10

Day of Month

=Day(A1)

18

Hour

=Hour(A1)

18

Minute

=Minute(A1)

10

Serial DT

=A1

37547.75694

Serial D

=Int(A1)

37547

Day of Year

=A1-Date(Year(A1),1,1)

291

Day of Week

=WeekDay(A1)

6

Week Number

=WeekNum(A1)

42

No. of Days in Month

=Day(Date(Year(A1),Month(A1)+1,1)-1)

31

No. Days between two Dates

=A2-A1

 

The number of days in month formula is useful in determining whether it is a leap year. The day of week function is very useful. Don’t enter the day yourself, let Excel calculate it based on the date. The Week number can be useful to summarize data by week. Similarly, the month function is useful to summarize data by month.  

 

Top

Range Names 

Range names are an important feature of Excel that helps with navigation, formula writing and checking, charting  and user documentation.

 In writing formulas, suppose you had flow data in cells B2:B8 and wanted to calculate the average flow. You could calculate the average by either of the two formulas below:

  •             =Sum(B2:B8)/7

  •             =Sum(Flow)/7

 The 2nd formula is easier to read and understand.

 Range names can be established in two steps:

  1. Highlight cell(s) to be included in range

  2. Place pointer in Name window and enter range name.

Range names can be placed in a formula by starting the formula with an “= “sign, entering part of the formula and pressing F3 at the place where you want the Range Name to be entered. All range names will be listed in the Paste Name Window.  Select the desired Range Name (flow in this example) and click OK.  Then complete the formula.

This saves typing and ensures that you use the correct spelling of the Range name.

Range name cell references are automatically updated when rows or columns are inserted or deleted.

Range names and range addresses can be clearly documented by using pressing F3 and pasting list of Ranges. This provides excellent Workbook  documentation.

 Users can name entire columns, making formula references very easy. Suppose you have a simple Excel list of dates in Column A and Flows in Column B. If you select column B (put cursor on B), then you can enter name in Name Box. Now the column can be referenced by “flow” in any formulas or look-ups.

 

Top

Excel Statistical Functions  

Excel has over 300 built in functions that do everything from calculate averages to maximums, standard deviations and the elapsed time between two dates.

You can access Excel’s built-in functions by pressing the function icon.

 

 

 

 

 

 

 

Excel's statistical analysis functions are described in the following table.
 
 

Statistical Function

Description (From Excel On-line Help File)

AVERAGE(A1:A5)

Returns the average (arithmetic mean) of the arguments.

COUNT(A1:A7)

Counts the number of cells that contain numbers and numbers within the list of arguments. Use COUNT to get the number of entries in a number field in a range or array of numbers.

COUNTA(A1:A7)

Counts the number of cells that are not empty and the values within the list of arguments. Use COUNTA to count the number of cells that contain data in a range or array.

COUNTIF(A1:A7,>0)

Counts the number of cells within a range that meet the given criteria.

COUNTBLANK(A1:A7)

Counts empty cells in a specified range of cells.

LARGE(A1:A7,2)

Returns the k-th largest value in a data set. You can use this function to select a value based on its relative standing. For example, you can use LARGE to return the highest, runner-up, or third-place score.

MAX(A1:A7)

Returns the largest value in a set of values.

MEDIAN(A1:A7)

Returns the median of the given numbers. The median is the number in the middle of a set of numbers; that is, half the numbers have values that are greater than the median, and half have values that are less.

MIN(A1:A7)

Returns the smallest number in a set of values.

PERCENTILE(A1:A7,0.3)

Returns the k-th percentile of values in a range. You can use this function to establish a threshold of acceptance. For example, you can decide to examine candidates who score above the 90th percentile.

PERCENTRANK(array,x,significance)

 

Returns the rank of a value in a data set as a percentage of the data set. This function can be used to evaluate the relative standing of a value within a data set. For example, you can use PERCENTRANK to evaluate the standing of an aptitude test score among all scores for the test.

QUARTILE(array,quart)

Returns the quartile of a data set. Quartiles often are used in sales and survey data to divide populations into groups. For example, you can use QUARTILE to find the top 25 percent of incomes in a population.

 

Top

If, And, Or 

The If function is a powerful Excel function that lets the user test conditions and return different values depending on the conditions.

 

The If function includes three arguments:

  1.  Logical test to be evaluated

  2. Value if logical test is true

  3. Value of logical test is false

  

The Interim Enhanced Surface Water Treatment Rules requires surface water treatment plants to monitor individual filter turbidity. The compliance criteria require that filter turbidity be less than 0.5 NTU 4 - hours after backwash. 

The sample spreadsheet to the right has been designed to track plant status, filter status change, the date and time of the last backwash filer status code and the date and time in 15-minute intervals. The worksheet automatically calculates the Filter Status code by comparing the time of the last backwash with the current date and time as well as plant status.  

The Filter Status Code for cells in Column E is established by the following nested If statement: 

 

 This formula has three nested If functions: 

  •  Is filter in backwash? If yes, enter “BW” in cell

  •  If no to 1, Is filter Off? If yes, enter “Off in cell.

  •  If no to 1 and 2, Is Date and Time < last backwash date and time + 4 hours. If yes, enter “On<4”, if no, enter “On>4”

 The And function can be combined with the If function to create complex logical tests where all arguments must be met for the logical test to be true. For example, if the user wants to test to see if cells C1,C2 and C3 contain A, B, and C respectively,  the user can use the following If (And) formula:

 =If(And(C1=”A”,C2=”B”,C3=”C”),”Yes, “No”)

 The Or function can be combined with the If function to create logical tests where either one condition of another condition must be met for a true outcome. In the previous example, suppose we want to have a true outcome if C1=”A”, or C2=”B” or C3=”C”. The following formula provides the correct outcome.

 =If(Or(C1=”A”,C2=”B”,C3=”C”),”Yes”, “No”)

 

Top

Formula Errors 

There are two common Excel formula errors: 

  • #Div/0! -  Division is zero in not mathematically correct. If the formula has a blank or zero in the divisor cell, then the #Div/0! error is returned.

  • #NA – This error value that means   "no value is available"

The division by zero and #N/A  errors can occur when a worksheet has a series of rows that are filled in over time. If there are rows that have formulas but no data, then either the #Div/0! or #N/A result can occur.

To avoid the  #Div/01 error display, the user can combine the If function and the IsBlank function to test for blank cell, as follows:

 =If(IsBlank(B3),” “, C3*8.345*D3)

 Another, more comprehensive approach, is to combine the If function with the ISERROR function, as follows:

 =If(ISERROR((C3*8.345*D3),” “, (C3*8.345*D3))

Top

Summarizing Data

Many compliance reports require the user to count the number of observations greater or less than a specific value or to calculate the sum of values for data within a specific time period, typically months or quarters. Suppose you have three columns of data, Dates in Column A, month of observation in Column B and Flow data in column C for an entire year. There are several summing and counting calculations that you can perform on that data set.

Fixed or Conditional

Calculation

Formula

Fixed

Sum flow values in specific range

= Sum(C2:C7

Count flow values in specific range

= Count(C2:C7)

Sum all flow values

= Sum(C:C)

Count all flow values

= Count(C:C)

Conditional

Sum February flow values

= SumIf(B:B, 2,C:C)

Count February flow values

= CountIf(B:B,2,C:C)

Count if February flow value < 1

= SumProducts(B2:B366 = 2)*(C2:C366 < 1)

 

 

 

 

 

Top

Importing Access Data Into Excel

 Users can import data from ACCESS by using MS Query found by clicking Data > Get External Data. The user defines the query the first time it is used; subsequent queries can be updated either automatically or with minor edits.

 MS Queries are established in 6 steps, as follows:

 Step 1: Activate MS Query by selecting Data  > Get External

 

 

 

 

Data > New Database Query

 

 Step 2: User must choose data source; in this example Access 97 database. Note that users may use either the Query Wizard or the full capabilities of MS Query. In the remainder of this example, the Query Wizard option is deselected and MS Query is used.

 

 

 

 

 
 

Step 3: MDS Query will connect to the data sources and display the available databases that are consistent with the type specified by the user. Make sure that you have indicated the proper file path.  Select the specific database that you wish to retrieve data from.

 

 

 

 

 

 

 

 

Step 4: MS Query will connect the specified database and present the data fields. At this point, the user selects the actual data fields that are to be retrieved by double clicking on the field names.

Step 5: The user now provides the criteria that are to be used to select the desired records. For process data, date and time range are likely criteria for record selection.

 Step 6: MS Query will display the retrieved data in a special MS Query window. The user may adjust the query, add-delete fields. Once the user is satisfied with the retrieval, the user should close the MS Query window and return to Excel’s normal window. MS Query will prompt the user to specify the cell where the retrieved data should be placed.

The user can rerun the Query at any time by activating the Data  Get External Data a Refresh Data. The user can also edit the Query to change the dates or other criteria by selecting Data   Get External Data a Edit Query

Top

Data Validation           

Data accuracy is critical. The Data Validation function is a powerful tool to help ensure good data quality. Data validation allows the workbook designer to restrict the type of data entered into specified cells and can provide instructions for users. The default setting is to allow any type of data in a cell. This can be restricted to allow only: 

  •  Whole numbers

  •  Decimal numbers

  •  Entries from a list

  •  Date

  •  Time

  • Text of specified length

  • Custom

 Data Validation is set up in 5 steps:

 

Select group of cells that you want to have the same data validation requirements.

From the main toolbar, select Data > DataValidation

Select type of data validation. The list option is selected in this example.

Enter data range for list. Note that Excel requires the list to be in same worksheet as the

data validation range. Use can get around this limitation by setting up your list as a range name.

Enter the range name in the Source box of the Data Validation window.

 

When you go to enter data in a list data validation range, Excel presents you the optional data items that you can select. If you try to enter a data item not on the list, Excel will prompt with an error message.