Tuesday, 31 March 2015

Database design


     The Chancellor's Management Information System has been designed to meet a variety of information needs identified by the Chancellor's Office. The identification process has taken into account both past experience, current, and anticipated needs. The use of the information in the database will be focused on issue identification, policy analysis, program evaluation, and planning.

      Five basic design objectives underlying the Chancellor's database.
1. Accountability the ability to respond to basic questions about the students being served, their demographic characteristics, the courses and services they are utilizing, their goals, and the outcomes they achieve through the community college system.
2. Integration of data the ability to link together related data elements to serve the basic questions discussed above. For example, it should be possible to relate basic student demographic data to course activity data to answer questions about the race and gender of unduplicated headcount of students enrolled in various occupational programs.
3. Quality of data the ability to provide accurate and consistent information on a timely basis. Accuracy and consistency require editing and comparison of data in order to eliminate unreasonable information.
4. Longitudinal tracking the ability to follow the progress of students in meeting their goals, and evaluating institutional performance over time. Policy makers want the ability to measure the success of specific programs (e.g., matriculation) in achieving improved student performance. Such measurements require the ability to follow the performance of students over time.
5. Flexibility the ability to perform a variety of ad hoc analyses using a stable base of data. Emerging issues or changing circumstances should not necessarily be grounds for new data requirements being placed upon the districts.

The information system is designed to collect and organize information from three major areas of activities:
• Students;
• Faculty and staff;
• Courses
    The Chancellor seeks to collect that data that can provide answers to fundamental questions related to each of the three areas listed. It would be impossible to list every possible question related to the three areas. It is more useful to review the data elements documented in the Dictionary. A review of the Dictionary will reveal the basic nature of information being sought. The use of common linking data elements (Social Security number, TOP and ASA codes, district and college number, etc.) and the use of data base software will allow data to be linked from the various categories in order to respond to a variety of questions.
     As indicated, the Chancellor's Office Management Information System database is actually comprised of several databases that each represent a distinct, logical collection of data that are related together using common linking data elements and database software. The principal databases that represent the major areas of activity are the Student, Section, Course, and Employee databases. Each of these databases is logically broken down into smaller groups of data. For example, the Student database includes the Student Basic Demographic database, the Student DSP&S database, the Student Enrollment database and the Student Financial Aid database. Each database stores a record for every occurrence of the entity that the database represents. For example, the Student Basic Demographic database represents the entity identified as: STUDENTS. Each record in this database stores demographic data for all students enrolled at every college in the state. College and student identifiers are used to identify individual students (records) at each college. Stored in every record, in addition to identifiers, are data elements that describe the entity. For example, stored in all records of the Student Basic Demographic database are student's gender, ethnicity, birthdate, etc. The material on the following pages outlines the function of each database and each database sub-group. In addition, the manner in which the databases are related together and their associated data elements are described.
    Boxes and lines are used on the following pages to illustrate the databases and the relationships between them. There are three possible relationships that can occur either among two or more databases or database sub-groups. They are:
(1) one to one,
(2) one to many,
(3) many to many.
  An example of a "many to many" relationship exists between the Student and the Section databases (see following pages). For every student enrolled at a college during a given term, those students can be enrolled in one or more course sections. Likewise, every course section offered at the same college can have one or more students enrolled in it. The lines below are used on the following pages to denote each of the possible relationships that can occur between each database and between each database sub-group. The arrows show the possible relationships that can occur between databases within the reporting period for the databases being related together. For example, a reporting period can consist of a semester or quarter, or in the case of financial aid, an academic year.

A. Objectives for Data Element coding scheme:
• Ensure consistency in coding all data elements;
• Provide rules which will act as a set of principles to guide the designer when creating a new set of codes for a new data element;
• Allow the user to apply rules that are "easy" to remember when interpreting data generated from ad-hoc queries;
• Develop a reporting system that is based on Positive Reporting.
   Positive Reporting is founded on the principle of requiring a valid response for all data elements (i.e., blanks and spaces are not valid values). This is to ensure that no data element is omitted
B. Data Element Classifications:
The data elements defined in the Data Element Dictionary fall into three classes of data: Alphanumeric, Numeric, and Dates.
•Numeric Codes (0-9):
   This coding scheme implies an ordering or ranking of codes. This implied ordering or ranking can occur for two reasons. First, the items of a data element may have a logical order or rank associated with them. In addition, numeric codes are used where there is a preferred ordering or ranking of data items on the input source documents or the output terminal displays and paper reports.
•Alpha Codes (A-Z):
   There are three reasons for using alpha codes. First, if there are more than ten categories of data items and there are requirements to limit the size of the data element to one character, alpha codes may be used to accomplish this. Second, where it seems appropriate, alpha codes may be used to embed some intelligence into the data items. For example, data items for Student Gender (see SB04) are coded "F" for "Female" and "M" for "Male". And third, where it is desirable to code data items with no order or intelligence, alpha codes may be sequentially assigned for this purpose.
•Period (.):
  A period (.) as part of a code acts as a place holder. This coding scheme is used to follow the principle of Positive Reporting.

• Not to be confused with Numeric Codes (above), numeric data elements contain numeric values that will be used in some arithmetic calculations.
• Date data elements represent valid dates in the format YYMMDD, where YY = Year, MM = Month, and DD = Day. The birthdate has been changed to YYYYMMDD, where YYYY = Year, MM = Month, and DD = Day. The Chancellor’s Office will determine the century. The day must be within a range valid for that month. Also, leap year must be considered when reporting for February (i.e., 28 or 29 days).
C. Defaults:
There are circumstances or conditions that require the use of a default value when the coding of a data element is unknown or not applicable. (Note: Some data elements do not allow default values. Refer to the Data Element Dictionary to determine if a default value may be applied to a particular data element.)
The conditions that can occur in which default values are used.

Unknown and Unreported: This occurs when the value for a particular data element is unknown because it could not be captured from the data source (i.e., a student did not fill out a survey item). Thus, the value for the data element is unknown and unreported.
Not Applicable: This occurs when there are two data elements that can be reported for the same entity, and depending on their values, only one of those data elements is applicable, but not both. Data Elements that fall into this classification are said to be mutually exclusive of one another. That is, if a data element is reported so that it negates the application of another data element, they are said to be mutually exclusive.
    For example, if a student drops a course, then the Enrollment Drop Date is coded with the appropriate date. If the student does not drop the course, the Enrollment Drop Date has no meaning for that Student Enrollment. Therefore, the Enrollment Drop Date would be coded as "Not Applicable." (i.e., "888888" since it is a date field.)


1 comment:

  1. Very significant Information for us, I have think the representation of this Information is actually superb one. This is my first visit to your site. Cloud Document Authentication