Tuesday, December 19, 2023

AWS - Build REST APIs on API Gateway and Lambda and DynamoDB

 

We break business functions down into small services and expose them through endpoints, so that they can be consumed by other applications across the enterprise, or by third parties, depending on the use case. Microservices advocate loosely integrating applications with high fault tolerance and great agility of development. On top of its cloud technologies, AWS offers serverless implementations that let customers focus on the business logic without managing servers. Events, each representing a change in state or an update, are produced, ingested, and routed among the serverless services; this is the core concept of Event-Driven Architecture.

In this post, we are going to build REST APIs using some of those cloud services. The business logic is implemented in Lambda, the data is stored in DynamoDB, and the APIs are deployed on API Gateway.



Regional Endpoint Type

Note: "Traffic that is in an Availability Zone, or between Availability Zones in all Regions, routes over the AWS private global network."

We are about to build a REST API that creates a table in DynamoDB with the following definition:

Table: Music
Partition key: Artist
Sort key: SongTitle
Attribute: AlbumTitle

Let’s start with creating the backend Lambda function.



Create a Backend Lambda Function

 

Sign in to the AWS Management Console, open the Lambda Console, and click Functions in the left navigation pane. The page presents a list of functions in the Region associated with your account.

Click Create function.



Choose Author from scratch.

Under Basic information, enter DemoCreateTable for Function name, and choose Python 3.12 for Runtime.



For Permissions, choose Use an existing role, and select TestRoleLambda for Existing role.

TestRoleLambda is a role created beforehand with the AmazonDynamoDBFullAccess policy attached.
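If the role doesn't exist yet, it could also be created programmatically. The boto3 sketch below is an illustration, not part of the walkthrough: the role name and trust policy are assumptions chosen to match the text, and actually running it requires IAM permissions and AWS credentials.

```python
import json

# Trust policy that lets the Lambda service assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

def create_lambda_role(role_name="TestRoleLambda"):
    import boto3  # requires AWS credentials when actually run
    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    # The walkthrough attaches full DynamoDB access to this role
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess",
    )
```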

Click Create function.



The function is created, and we can view its details.



Click the Code tab. You'll see the default function, which returns the message "Hello from Lambda!".



Replace the default function with the following source code.

Lambda: DemoCreateTable

import json
import boto3 as bo
import botocore as bc

def lambda_handler(event, context):

    # Look for the parameters in the headers, then the query string, then the body
    if event.get('headers') is not None:
        dictparam = event['headers']
    elif event.get('queryStringParameters') is not None:
        dictparam = event['queryStringParameters']
    elif event.get('body') is not None:
        dictparam = json.loads(event['body'])
    else:
        return {
            'statusCode': 400, 
            'body': json.dumps('Name of the table to be created is not specified.')
        }

    try:
        tablename = dictparam['table']
        client = bo.client('dynamodb')
    
        response = client.create_table(
            AttributeDefinitions=[
                {
                    'AttributeName': 'Artist',
                    'AttributeType': 'S',
                },
                {
                    'AttributeName': 'SongTitle',
                    'AttributeType': 'S',
                },
            ],
            KeySchema=[
                {
                    'AttributeName': 'Artist',
                    'KeyType': 'HASH',
                },
                {
                    'AttributeName': 'SongTitle',
                    'KeyType': 'RANGE',
                },
            ],
            ProvisionedThroughput={
                'ReadCapacityUnits': 5,
                'WriteCapacityUnits': 5,
            },
            TableName=tablename,
        )
        
        code = 200
        msg = 'Table created'
    except bc.exceptions.ClientError as e:
        code = 500
        msg = str(e)
    except KeyError as e:
        code = 400
        msg = 'KeyError exception happened while using key {} to get the table name.'.format(str(e))

    return { 
        'statusCode': code, 
        'body': json.dumps(msg)
    }
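The handler's parameter lookup, headers first, then the query string, then the body, can be exercised locally without touching AWS. This standalone sketch mirrors that logic:

```python
import json

def extract_params(event):
    """Mirror of the handler's lookup order: headers, query string, then body."""
    if event.get('headers') is not None:
        return event['headers']
    if event.get('queryStringParameters') is not None:
        return event['queryStringParameters']
    if event.get('body') is not None:
        return json.loads(event['body'])
    return None

# The same shapes that API Gateway proxy integration produces
print(extract_params({'headers': {'table': 'Music'},
                      'queryStringParameters': None, 'body': None}))    # {'table': 'Music'}
print(extract_params({'headers': None, 'queryStringParameters': None,
                      'body': '{"table": "Music"}'}))                   # {'table': 'Music'}
```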



Click Deploy.




How to Test the Lambda Function?


Go to the Test tab, choose Create new event, enter DemoEventCreateTable for Event name, enter {"headers":{"table":"Music"}} for Event JSON, and click Save.



Go back to the Code tab and click Test. You can check the execution results; the response says the table has been created.



Now that the Lambda function is created, it is time to create the REST API.

Navigate to the API Gateway Console, click APIs on the left pane. You’ll see a list of currently available APIs.

Click Create API.



Choose REST API.



Choose New API.

Enter DemoRESTAPI for API name.

Choose Regional for API endpoint type.

Click Create API.



The Console navigates to the Resources page.

Click Create resource.



Enter DemoResourceMusic for Resource name.

Click Create resource.



DemoResourceMusic was created directly under the root "/". We are going to create one more resource under DemoResourceMusic.

Click Create resource again.



Enter DemoResourceMusicTable for Resource name.

Click Create resource.



You can find the newly created resource, DemoResourceMusicTable, in the Resources tree. We are going to create a PUT method for it.

Click Create method.



Choose PUT for Method type.

Choose Lambda function for Integration type.

Check Lambda proxy integration.



For Lambda function, choose DemoCreateTable created in the previous step.

Click Create method.



A success message pops up on the top of the page.




Test the API


We can run a test right now. Go to the Test tab, enter table:Music in the Headers box, and click Test at the bottom of the page.



On the same page, we can view the results, as shown in the screenshot below. We got the message "Table created", generated by the backend Lambda function.



To make the API available to consumers, we'll need to deploy it first.

Click Deploy API.



Choose New stage for Stage.

Enter Test for Stage name.

Click Deploy.



The deployment has been created and is now active for testing.



We are going to send logs to CloudWatch to trace the API's execution.

Click Logs and tracing.



Choose Full request and response logs for CloudWatch logs.

Enable Custom access logging.

Create a log group named DemoRESTAPI in the CloudWatch Console and paste its ARN into the Access log destination ARN field.

For Log format, get the JSON template from the "Learn more" link.

Click Save changes.



We also need to set up an IAM role for outputting logs to CloudWatch.

Navigate to Settings of the API, go to Logging section, and click Edit.

Choose TestRoleApiGateway, a predefined role with the AmazonAPIGatewayPushToCloudWatchLogs policy attached.



After all this is done, we can test the REST API from CloudShell using the curl command.

curl -X PUT https://ixun012ycl.execute-api.us-west-2.amazonaws.com/Test/DemoResourceMusic/DemoResourceMusicTable -H 'content-type: application/json' -H 'table:Music'

We get the "Table created" message; it works.




Logs


Let’s check the logs and get an insight into what actually happened inside the API.

Open CloudWatch Console, click Log groups on the left pane, and we can identify the following log groups associated with the REST API.

- API-Gateway-Execution-Logs_ixun012ycl/Test

- DemoRESTAPI



Below are examples of each.



Execution Log: API-Gateway-Execution-Logs_ixun012ycl/Test


Access Log: DemoRESTAPI


What Is Passed to the Backend Lambda? 


This varies depending on whether the Lambda proxy integration setting is enabled.

With Lambda proxy integration enabled, the full request envelope is handed to the backend function, as shown below.

{
    'resource': '/DemoResourceMusic/DemoResourceMusicTable',
    'path': '/DemoResourceMusic/DemoResourceMusicTable',
    'httpMethod': 'PUT',
    'headers': {'table': 'Music'},
    'multiValueHeaders': {'table': ['Music']},
    'queryStringParameters': None,
    'multiValueQueryStringParameters': None,
    'pathParameters': None,
    'stageVariables': None,
    'requestContext': {
        'resourceId': 'fby8li',
        'resourcePath': '/DemoResourceMusic/DemoResourceMusicTable',
        'httpMethod': 'PUT',
        'extendedRequestId': 'QBkk2FGJvHcFt3g=',
        'requestTime': '16/Dec/2023:07:13:34 +0000',
        'path': '/DemoResourceMusic/DemoResourceMusicTable',
        'accountId': 'nnnnnnnnnnnn',
        'protocol': 'HTTP/1.1',
        'stage': 'test-invoke-stage',
        'domainPrefix': 'testPrefix',
        'requestTimeEpoch': 1702710814759,
        'requestId': 'ab5bfc95-cf93-4eac-8357-e4a1f75f8585',
        'identity': {
            'cognitoIdentityPoolId': None,
            'cognitoIdentityId': None,
            'apiKey': 'test-invoke-api-key',
            'principalOrgId': None,
            'cognitoAuthenticationType': None,
            'userArn': 'arn:aws:iam::nnnnnnnnnnnn:user/TestIAMUser',
            'apiKeyId': 'test-invoke-api-key-id',
            'userAgent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
            'accountId': 'nnnnnnnnnnnn',
            'caller': 'AIDAQ3G4CFOKKZWQESTX4',
            'sourceIp': 'test-invoke-source-ip',
            'accessKey': 'ASIAQ3G4CFOKGRULRTC2',
            'cognitoAuthenticationProvider': None,
            'user': 'AIDAQ3G4CFOKKZWQESTX4'
        },
        'domainName': 'testPrefix.testDomainName',
        'apiId': 'ixun012ycl'
    },
    'body': None,
    'isBase64Encoded': False
}
 

In contrast, only the body of the request is passed to the backend function if the Lambda proxy integration setting is disabled.
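To make the contrast concrete, here are two hypothetical event sketches; the field values follow the example above:

```python
# With proxy integration: the full request envelope arrives
proxy_event = {
    'httpMethod': 'PUT',
    'headers': {'table': 'Music'},
    'queryStringParameters': None,
    'body': None,
}

# Without proxy integration (and no mapping template): essentially just the body
non_proxy_event = {'table': 'Music'}

# A proxy handler digs the parameters out of the envelope...
table_from_proxy = proxy_event['headers']['table']
# ...while a non-proxy handler reads them directly
table_from_non_proxy = non_proxy_event['table']
```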

The responses also differ slightly. The Lambda function returns a status code and a body.

return {
    'statusCode': code,
    'body': json.dumps(msg)
}
 

With Lambda proxy integration, the code is set as the Response’s Status Code and the message is set as its body.


Status: 200
Response body: "Table created"
Response headers:
{
  "X-Amzn-Trace-Id": "Root=1-657d49e2-f5f6fd248da8e0881dd97df6;Sampled=0;lineage=cdebf0f4:0"
}


Without Lambda proxy integration, the code and the message are combined and set as the Response’s body.


Status: 200
Response body: {"statusCode": 200, "body": "\"Table created\""}
Response headers:
{
  "Content-Type": "application/json",
  "X-Amzn-Trace-Id": "Root=1-65801497-1347f5f3c51608798c637128;Sampled=0;lineage=cdebf0f4:0"
}


 

Access Control


To manage access to a REST API, API Gateway supports several mechanisms; please refer to the Amazon API Gateway Developer Guide for more information.

A resource policy, which you can define in the API Gateway Console, is stated in the IAM policy language. Here is a standard template that restricts access to a specified source VPC.

 
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "execute-api:Invoke",
            "Resource": "execute-api:/{{stageNameOrWildcard}}/{{httpVerbOrWildcard}}/{{resourcePathOrWildcard}}",
            "Condition": {
                "StringNotEquals": {
                    "aws:sourceVpc": "{{vpcID}}"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "execute-api:Invoke",
            "Resource": "execute-api:/{{stageNameOrWildcard}}/{{httpVerbOrWildcard}}/{{resourcePathOrWildcard}}"
        }
    ]
}

 

More APIs


Following the procedures above, we can build more methods and more APIs. For example, let's create another method, DELETE, for the resource DemoResourceMusicTable.



Lambda: DemoDeleteTable

import json
import boto3 as bo
import botocore as bc

def lambda_handler(event, context):

    # Look for the parameters in the headers, then the query string, then the body
    if event.get('headers') is not None:
        dictparam = event['headers']
    elif event.get('queryStringParameters') is not None:
        dictparam = event['queryStringParameters']
    elif event.get('body') is not None:
        dictparam = json.loads(event['body'])
    else:
        return {
            'statusCode': 400, 
            'body': json.dumps('Name of the table to be deleted is not specified.')
        }

    try:
        tablename = dictparam['table']
        client = bo.client('dynamodb')
        
        response = client.delete_table(
            TableName = tablename,
        )
                
        code = 200
        msg = 'Table deleted'
    except bc.exceptions.ClientError as e:
        code = 500
        msg = str(e)
    except KeyError as e:
        code = 400
        msg = 'KeyError exception happened while using key {} to get the table name.'.format(str(e))
        
    return { 
        'statusCode': code, 
        'body': json.dumps(msg)
    }    
  

For the same resource, we don't have to create a separate Lambda function for each HTTP method. Instead, we can create one Lambda function and use httpMethod to branch the processing for each method.

For the DemoResourceMusicItem resource, a consolidated function handles all the methods.



Lambda: DemoHandleItem

import json
import boto3 as bo
import botocore as bc

def lambda_handler(event, context):

    # Look for the parameters in the headers, then the query string, then the body
    if event.get('headers') is not None:
        dictparam = event['headers']
    elif event.get('queryStringParameters') is not None:
        dictparam = event['queryStringParameters']
    elif event.get('body') is not None:
        dictparam = json.loads(event['body'])
    else:
        return {
            'statusCode': 400, 
            'body': json.dumps('Item to be processed is not specified.')
        }

    #
    # Add an item
    if event['httpMethod'] == 'PUT':
        
        try:
            tablename = dictparam['table']
            artist = dictparam['artist']
            songtitle = dictparam['songtitle']
            albumtitle = dictparam['albumtitle']
            
            client = bo.client('dynamodb')
            response = client.put_item(
                Item={
                    'Artist': {
                        'S': artist,
                    },
                    'AlbumTitle': {
                        'S': albumtitle,
                    },
                    'SongTitle': {
                        'S': songtitle,
                    },
                },
                ReturnConsumedCapacity='TOTAL',
                TableName = tablename,
            )
            
            code = 200
            msg = 'Item added'
        except bc.exceptions.ClientError as e:
            code = 500
            msg = str(e)
        except KeyError as e:
            code = 400
            msg = 'KeyError exception happened while using key {} to get the value.'.format(str(e))
    
        return {
            'statusCode': code,
            'body': json.dumps(msg)
        }
    #
    # Delete an item
    elif event['httpMethod'] == 'DELETE':
        try:
            tablename = dictparam['table']
            artist = dictparam['artist']
            songtitle = dictparam['songtitle']
            
            client = bo.client('dynamodb')
            response = client.delete_item(
                Key={
                    'Artist': {
                        'S': artist,
                    },
                    'SongTitle': {
                        'S': songtitle,
                    },
                },
                TableName = tablename,
            )
                    
            code = 200
            msg = 'Item deleted'
    
        except bc.exceptions.ClientError as e:
            code = 500
            msg = str(e)
        except KeyError as e:
            code = 400
            msg = 'KeyError exception happened while using key {} to get the value.'.format(str(e))
    
        return {
            'statusCode': code,
            'body': json.dumps(msg)
        }
    #
    # Select an item
    elif event['httpMethod'] == 'GET':
        try:
            tablename = dictparam['table']
            artist = dictparam['artist']
            songtitle = dictparam['songtitle']
            
            client = bo.client('dynamodb')
            response = client.get_item(
                Key={
                    'Artist': {
                        'S': artist,
                    },
                    'SongTitle': {
                        'S': songtitle,
                    },
                },
                TableName = tablename,
            )
                    
            code = 200
            if 'Item' in response.keys():
                msg = response['Item']
            else:
                msg = 'Item not found'
                
        except bc.exceptions.ClientError as e:
            code = 500
            msg = str(e)
        except KeyError as e:
            code = 400
            msg = 'KeyError exception happened while using key {} to get the value.'.format(str(e))
    
        return {
            'statusCode': code,
            'body': json.dumps(msg)
        }
    #
    # Update an item
    elif event['httpMethod'] == 'POST':
        try:
            tablename = dictparam['table']
            artist = dictparam['artist']
            songtitle = dictparam['songtitle']
            albumtitle = dictparam['albumtitle']
            
            client = bo.client('dynamodb')
            response = client.update_item(
                ExpressionAttributeNames={
                    '#AT': 'AlbumTitle',
                },
                ExpressionAttributeValues={
                    ':t': {
                        'S': albumtitle,
                    },
                },
                Key={
                    'Artist': {
                        'S': artist,
                    },
                    'SongTitle': {
                        'S': songtitle,
                    },
                },
                ReturnValues = 'ALL_NEW',
                TableName = tablename,
                UpdateExpression='SET #AT = :t',
            )
                    
            code = 200
            msg = 'Item updated'
    
        except bc.exceptions.ClientError as e:
            code = 500
            msg = str(e)
        except KeyError as e:
            code = 400
            msg = 'KeyError exception happened while using key {} to get the value.'.format(str(e))
    
        return {
            'statusCode': code,
            'body': json.dumps(msg)
        }
    #
    # Undefined request
    else:
        return {
            'statusCode': 400, 
            'body': json.dumps('Undefined request.')
        }

Let’s run a test using the following commands. 



Delete a table:

curl -X DELETE https://ixun012ycl.execute-api.us-west-2.amazonaws.com/Test/DemoResourceMusic/DemoResourceMusicTable -H 'content-type: application/json' -H 'table:Music'

Put an item:

curl -X PUT https://ixun012ycl.execute-api.us-west-2.amazonaws.com/Test/DemoResourceMusic/DemoResourceMusicItem -H 'content-type: application/json' -H 'table:Music' -H 'artist:No One You Know' -H 'albumtitle:Somewhat Famous' -H 'songtitle:Call Me Today'

Update an item:

curl -X POST https://ixun012ycl.execute-api.us-west-2.amazonaws.com/Test/DemoResourceMusic/DemoResourceMusicItem -H 'content-type: application/json' -H 'table:Music' -H 'artist:No One You Know' -H 'albumtitle: Louder Than Ever' -H 'songtitle:Call Me Today'

Get an item:

curl -X GET https://ixun012ycl.execute-api.us-west-2.amazonaws.com/Test/DemoResourceMusic/DemoResourceMusicItem -H 'content-type: application/json' -H 'table:Music' -H 'artist:No One You Know' -H 'songtitle:Call Me Today'

Delete an item:

curl -X DELETE https://ixun012ycl.execute-api.us-west-2.amazonaws.com/Test/DemoResourceMusic/DemoResourceMusicItem -H 'content-type: application/json' -H 'table:Music' -H 'artist:No One You Know' -H 'songtitle:Call Me Today'

 

Friday, September 1, 2023

Machine Learning - Build and Compare Regression Models


This post is a follow-up to Build and Compare Classification Models. We are going to build and compare a bunch of regression models this time.

The program uses the same mechanism: a factory class named RegressorFactory takes on instantiating, fitting, predicting with, and evaluating each model.

The function CompareRegressionModels() implements the workflow.

Along with the score (coefficient of determination) provided by a model itself, root mean squared error (RMSE) is chosen as another indicator to evaluate the models. You can find both on the comparison graph.

On the chart, each model is represented by a solid circle, with its score displayed nearby. The RMSEs of the training data and the test data are plotted on the X and Y axes, respectively.
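These two indicators can be computed for one model with a short self-contained sketch; the dataset and the LinearRegression choice here are illustrative, as the real program reads its data from DatasetReg:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Illustrative data and model
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)

model = LinearRegression().fit(X_train, y_train)

# score(): coefficient of determination (R^2), displayed near the circle
r2 = model.score(X_test, y_test)

# RMSE of the training data (X axis) and the test data (Y axis)
rmse_train = np.sqrt(np.mean((y_train - model.predict(X_train)) ** 2))
rmse_test = np.sqrt(np.mean((y_test - model.predict(X_test)) ** 2))
```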


make_regression Dataset


Another example, for the California Housing dataset.


California Housing



Models to Be Compared

The regression models are also administered in the table ModelList, with Category set to Regression, as shown in the screenshot below.

The PreCalc indicator is set to 1 if the model requires polynomial calculation; otherwise it is NULL.

The Parameter and Value pair defines the parameters passed to the model at creation. The table structure lets you add an unlimited number of parameters. What if we specify a duplicate parameter? The first one is picked and passed to the model.
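That first-one-wins rule can be sketched in a few lines; the tuples below stand in for hypothetical rows read from ModelList (the column layout is assumed):

```python
# Hypothetical rows from ModelList: (Name, Parameter, Value)
rows = [
    ('Ridge', 'alpha', '0.1'),
    ('Ridge', 'random_state', '123'),
    ('Ridge', 'alpha', '0.5'),      # duplicate parameter: ignored
]

params = {}
for name, parameter, value in rows:
    if parameter and parameter not in params:   # first occurrence wins
        params[parameter] = value

print(params)   # {'alpha': '0.1', 'random_state': '123'}
```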

 


Dataset

The dataset should be loaded into the table DatasetReg. Please make sure you add only the feature columns and the label column to the table, and place the label column last.

The sample is a dataset generated by make_regression() method.

 

make_regression Dataset



Main Flow

The main flow is realized in the function CompareRegressionModels() and is shown in the diagram below.

One thing we need to pay attention to: some models, such as Lasso, Polynomial, and Ridge, require the input data to be transformed with a polynomial matrix before it is fed in. So, after standardizing the input data, we call PolynomialFeatures() to prepare the polynomial calculation matrix, pass the standardized data through it, and feed the output to that group of models. As a result, the flow becomes slightly different.
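That pre-calculation step can be sketched with sklearn; the dataset, the degree, and the Ridge parameters here are illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = make_regression(n_samples=200, n_features=3, noise=5.0, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)

# 1. Standardize the input data
scaler = StandardScaler().fit(X_train)
X_train_std, X_test_std = scaler.transform(X_train), scaler.transform(X_test)

# 2. Pass the standardized data through the polynomial calculation matrix
poly = PolynomialFeatures(degree=2).fit(X_train_std)
X_train_poly, X_test_poly = poly.transform(X_train_std), poly.transform(X_test_std)

# 3. Feed the output to the PreCalc group of models (Ridge shown here)
model = Ridge(alpha=0.1, random_state=123).fit(X_train_poly, y_train)
score = model.score(X_test_poly, y_test)
```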




How to Add a Model?

If you want to add more models, insert the corresponding records into the table ModelList using SQL scripts or any database tool. Please make sure the new model has been included in the class RegressorFactory; otherwise, you will get a warning message saying the model is not implemented.


insert into modellist values('Multiple', 'Regression', null, '', '');
insert into modellist values('Polynomial', 'Regression', 1, '', '');
insert into modellist values('Ridge', 'Regression', 1, 'alpha', '0.1');
insert into modellist values('Ridge', 'Regression', 1, 'random_state', '123');


The newly added models pop up on the graph.



make_regression Dataset


How to Switch the Dataset?

The program fetches data from table DatasetReg, so the data must be moved into DatasetReg. Here is an example for your reference.

- Create an external table named Dataset_Housing.

create table Dataset_Housing (
   MedInc number
  ,HouseAge number
  ,AveRooms number
  ,AveBedrms number
  ,Population number
  ,AveOccup number
  ,Latitude number
  ,Longitude number
  ,Price number
)
organization external
(
  type oracle_loader
  default directory externalfile
  access parameters
  (
    records delimited by newline
    nobadfile
    nologfile
    fields terminated by ','
  )
  location ('cal_housing.csv')
)
reject limit unlimited
;

- Drop table DatasetReg.

- Create table DatasetReg from Dataset_Housing.




Relook at Mountain Temperature Prediction

GradientBoostingRegressor is used for mountain temperature prediction in the blog Machine Learning - Build A GradientBoostingRegressor Model. In effect, it overfit the training dataset, so this time we run all these regression models on the same dataset.

As you can see from the evaluation graph, DecisionTreeRegressor, Polynomial, RandomForestRegressor, and Ridge tend to overfit as well in this particular predictive case.

 




Appendix: Source Code


CompareRegressionModels()

Machine Learning - Build and Compare Classification Models


When it comes to machine learning, there is a wide range of models available in the field. No doubt it would be a huge project, in terms of effort and time, to walk through them all. So I've been thinking it may be a good idea to get them working first with example Python code. As for concepts, underlying math, business scenarios, profits, concerns, and so on, we can pick those up with more research later, when we work on a specific business case. The merit of doing this is that we get to know these models, probably just a little, and gain hands-on programming experience in the first place. It lays a foundation for further development as needed.

I happened to read a book, Python Statistics & Machine Learning Mastering Handbook, authored by Team Karupo. The book introduces a bunch of models with concise text and examples, covering supervised, unsupervised, and deep learning estimators.

Inspired by it, I planned to build a series of models at once that can be flexibly selected, and to visualize their performance on the same graph. Moreover, I would like to approach it from an engineering angle and make the process as simple as feeding in the input dataset and getting back visual evaluation results, like the graph shown below. (Please note that the process of developing and tuning a model is not the subject here; if you are interested in that, please check out Machine Learning - Build A GradientBoostingRegressor Model for more details.)



For each model, we'll run predictions on both the test data and the training data and calculate their accuracy scores, then display them on the chart, where each model corresponds to a solid circle. The X axis represents the accuracy score for the training data, whereas the Y axis represents the accuracy score for the test data.

At the same time, we use the score() method provided by the model to get the score for this specific case, displayed right above the circle.
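These indicators can be computed for one classifier with a short self-contained sketch; the dataset and the LogisticRegression choice are illustrative, as the real program reads its data from DatasetCls:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Illustrative data and model
X, y = make_classification(n_samples=300, n_features=10, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)

scaler = StandardScaler().fit(X_train)
X_train_std, X_test_std = scaler.transform(X_train), scaler.transform(X_test)

model = LogisticRegression(random_state=123).fit(X_train_std, y_train)

acc_train = accuracy_score(y_train, model.predict(X_train_std))   # X axis
acc_test = accuracy_score(y_test, model.predict(X_test_std))      # Y axis
score = model.score(X_test_std, y_test)                           # text above the circle
```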



Models to Be Compared

Models to be compared are stored in the table ModelList, as shown below. 

Name field bears the name of a model. 

Category field defines which group the model belongs to, either Classification or Regression. 

Parameter and Value fields define parameters to be passed to the model in its creation. Parameter field holds the name of the parameter, and Value field carries the value associated with that parameter. You can add as many parameters as you need for a model. On the other hand, if you don't specify any parameters for a model, the model will take the default parameters.

PreCalc is an indicator showing whether we need to carry out extra calculations on the input data before it is passed to a model. For example, we should use a polynomial to transform the data before feeding it into the Lasso regression model. We'll discuss this more in another blog, Machine Learning - Build And Compare Regression Models.




Dataset

The dataset is kept in a table called DatasetCls, which consists of only the feature columns and the label column. Any descriptive columns have to be removed. Additionally, the label column has to appear last. DatasetCls can be either a normal table or an external table.

That is all; there are no more rules.
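Because the label column always comes last, splitting a fetched table into features and label is trivial; the rows below are made-up, Iris-like values used purely for illustration:

```python
import numpy as np

# Rows fetched from DatasetCls: feature columns first, label column last
data = np.array([
    [5.1, 3.5, 1.4, 0.2, 0],
    [7.0, 3.2, 4.7, 1.4, 1],
    [6.3, 3.3, 6.0, 2.5, 2],
])

# Everything but the last column is a feature; the last column is the label
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)   # (3, 4) (3,)
```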


Dataset Generated by make_classification()



Main Flow

The main flow, illustrated in the following diagram, is implemented in the function CompareClassificationModels(), which you can find in the Appendix: Source Code section.

Please note that we standardize the training data and the test data before feeding them into the models.


Main Flow



How Are Models Created?

Model instantiation is implemented in class ClassifierFactory packaged in modelfactory.py. Please refer to Appendix: Source Code section.

The factory method newclassifier() creates and returns a model with specified parameters. The models defined inside the class are all the classifiers that this factory can produce so far. You can add or delete models based on your needs.

The method execute() calls fit(), predict(), and score() on the model. Additionally, it calls accuracy_score() to evaluate the model's performance on the given datasets.
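A stripped-down sketch of such a factory follows; the class and method names mirror the post, while the model table and error message are assumptions (the real class covers many more classifiers):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

class ClassifierFactory:
    """Minimal sketch: maps a model name from ModelList to a classifier instance."""
    _models = {
        'LogisticRegression': LogisticRegression,
        'DecisionTreeClassifier': DecisionTreeClassifier,
    }

    def newclassifier(self, name, **params):
        # Unknown names trigger the "not implemented" warning described above
        if name not in self._models:
            raise NotImplementedError(f'{name} is not implemented as of now.')
        return self._models[name](**params)

factory = ClassifierFactory()
model = factory.newclassifier('LogisticRegression', random_state=123)
```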



How to Add a Model?

This is straightforward: append a record to the table ModelList. For example, to add LogisticRegression to the list and specify the random_state parameter at the same time, we can execute the following SQL statement.

insert into modellist values('LogisticRegression', 'Classification', null, 'random_state', '123');



Then we re-run the program and, as you can see, LogisticRegression appears on the graph.




How to Switch the Dataset?

For example, suppose we would like to use the wine dataset that comes with sklearn. If we have the CSV file on hand, we can create an external table using the SQL script below. If the script doesn't work in your environment, double-check that your CSV file has the right encoding.

create table datasetcls (
  fixed_acidity number,
  volatile_acidity number,
  citric_acid number,
  residual_sugar number,
  chlorides number,
  free_sulfur_dioxide number,
  total_sulfur_dioxide number,
  density number,
  pH number,
  sulphates number,
  alcohol number,
  quality number
)
organization external
(
  type oracle_loader
  default directory externalfile
  access parameters
  (
    records delimited by newline
    nobadfile
    nologfile
    fields terminated by ';'
  )
  location ('winequality-red.csv')
)
reject limit unlimited
;

Wine Dataset

We can do the same for Iris dataset.


Iris Dataset

For the dataset processing, we used hard-coded parameters, so there is obviously room for improvement; in response to needs in the field, we can add more customized functionality.

Additionally, similar work for the regression models will be undertaken and summarized in another post.



Reference

Python Statistics & Machine Learning Mastering Handbook, Team Karupo

Choosing the right estimator

Machine Learning - Build A GradientBoostingRegressor Model

Machine Learning - Build And Compare Regression Models



Appendix: Source Code


CompareClassificationModels()
