
Machine Learning Tool for Smart Contracts


zak100


Hello @zak100, I think a more important step to take before assembling 100,000 instances of data is benchmarking the performance of some baseline or prototype model that you can train on smaller datasets. Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority. However, at the moment, I would think it is more crucial for you to read the articles that @Ghideon and I have discussed. With regard to my assisting you through this process, I meant that I can provide a simple model for you in Keras: not a model for SC vulnerability detection, but rather one to just minimally acquaint you with Keras.

Thank you @Ghideon for showing me how to post code on here.

Here is the general model pipeline in Keras:

1. Create an instance of your model

2. Compile the model

3. Fit the model to your training data

4. Evaluate your model's performance

5. Predict on new batches of data using the saved model weights (see the sketch after the code below)

# Example from some project I did.
# data = a pandas DataFrame of numeric features (the dataset itself is not included here)

import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from tensorflow import keras
from tensorflow.keras import layers

# the 'target' is what you want to predict
# since you do not have the data I used here, this code will not actually run as-is;
# it is purely for illustrative purposes
target = data.pop('Wattage').values.reshape(-1, 1)
features = MinMaxScaler().fit_transform(data.values)

# your scaled features and target go into the dataset here
dataset = tf.data.Dataset.from_tensor_slices((features, target))

# create the training, testing, and validation datasets
# (Keras expects batched tf.data datasets, hence the .batch() calls)
train_size = int(len(features) * 0.7)
test_size = int(len(features) * 0.15)
train_dataset = dataset.take(train_size).batch(32)
test_dataset = dataset.skip(train_size)
val_dataset = test_dataset.skip(test_size).batch(32)
test_dataset = test_dataset.take(test_size).batch(32)

# THIS IS THE IMPORTANT PART, FOR BUILDING A MODEL
# a keras Sequential model with three dense layers, the last being the output layer
# in this case we put a '1' for the 'units' parameter because we are predicting one target
model = keras.Sequential(
    [
        layers.Dense(units=32, activation='relu', name='layer1'),
        layers.Dense(units=64, activation='relu', name='layer2'),
        layers.Dense(units=1, name='end'),
    ]
)

# compile the model with the correct optimizer, loss, and metrics
model.compile(
    optimizer='adam',
    loss='mse',
    metrics=['mse']
)

# fit your model to the training dataset and specify the validation dataset
model.fit(
    x=train_dataset,
    epochs=20,
    validation_data=val_dataset,
    verbose=1,
    callbacks=[tf.keras.callbacks.EarlyStopping(patience=5)],
)

# evaluate the model's performance on the held-out test set
model.evaluate(
    x=test_dataset,
    verbose=1,
)

# save model for future use, so you do not have to retrain it
model.save(
    filepath='/tmp/trained_on_cleaned_02',
)
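For step 5 of the pipeline, a minimal sketch of reloading the saved model and predicting on new data might look like the following. The file path matches the save call above, but `new_data` and its feature count are placeholders you would replace with your own inputs:

# minimal sketch of step 5: reload the saved model and predict on new data
# NOTE: `new_data` is a placeholder; real inputs must have the same number
# of features (and the same MinMax scaling) as the training data
import numpy as np
from tensorflow import keras

# load the model saved earlier, architecture and weights included
restored = keras.models.load_model('/tmp/trained_on_cleaned_02')

new_data = np.random.rand(10, 4).astype('float32')  # placeholder: 10 rows, 4 features
predictions = restored.predict(new_data)
print(predictions.shape)  # (10, 1) -> one predicted value per row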

 


Hi @The Mule and @Ghideon - Thanks for the discussion and for providing me with the essential steps for creating a Python model.

<benchmarking the performance of some baseline or prototype model that you can train on smaller datasets>

Yes, you are right. First I have to come up with some prototype model.

<Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority. However, at the moment,>

Yes, I would appreciate your help as much as possible.

I got some idea. I hope that once I read the papers, I will know more.

< I would think it is more crucial for you to read the articles that @Ghideon and I have discussed. >

Surely I will look at the articles which you (@The Mule) and @Ghideon have pointed to.

God bless you.

 

Zulfi.


22 hours ago, The Mule said:

I think a more important step to take before assembling 100,000 instances of data is benchmarking the performance of some baseline or prototype model that you can train on smaller datasets. Once the major limitation becomes how much data you have, then I would consider the problem of getting a ton of data to be your first priority.

For the paper zak100 provided in the opening post, my understanding was that the authors gave experimental evidence that their approach worked and performed well. Hence I recommended first checking that it was possible to get hold of the data. But your contributions allow for a shift of focus, as you suggested above; there are now multiple ways to get data, and possible alternative approaches. I support your suggestion to start with a prototype model, unless Zak still wants to pursue exactly the approach described in the paper in the OP.

22 hours ago, The Mule said:

However, at the moment, I would think it is more crucial for you to read the articles that @Ghideon and I have discussed.

I agree. 

 

22 hours ago, The Mule said:

one to just minimally acquaint you with Keras

@zak100 Here is a quick attempt at providing a little quiz 
(Hope @The Mule corrects me if I get this wrong)

The model in the Keras code example provided by The Mule differs slightly from your initial question and from the approach in the paper. Which Keras class may be a reasonable starting point for the kind of neural network that the paper* in your first post uses?

 

*) The paper in opening post: Towards Safer Smart Contracts: A Sequence Learning Approach to Detecting Security Threats


1 hour ago, Ghideon said:

The model in the Keras code example provided by The Mule differs slightly from your initial question and the approach in the paper

In fact, the Python code that I inserted above is quite different from the methods employed in the paper. The code I added was a very rudimentary presentation of what the structure of something written in Keras may look like. Nonetheless, I agree with @Ghideon that @zak100 should attempt to piece together which major components of Keras correspond to the methods used in the paper. As an example, if the paper mentioned a Convolutional Neural Network, then two appropriate answers would be to (1) Google "keras layers" -> https://keras.io/api/layers/ -> look at the convolutional layers [screenshot of the keras.io layers page omitted], or (2) work through a Keras CNN tutorial -> https://victorzhou.com/blog/keras-cnn-tutorial/
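To make that concrete, here is a minimal, purely illustrative sketch of a convolutional model built from those Keras layers. The layer sizes, sequence length, and number of channels are arbitrary assumptions, not values taken from the paper:

# minimal illustrative Keras CNN; all sizes here are arbitrary assumptions
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

cnn = keras.Sequential(
    [
        # 1D convolution over sequences of length 100 with 8 channels each
        layers.Conv1D(filters=32, kernel_size=3, activation='relu',
                      input_shape=(100, 8)),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dense(units=1, activation='sigmoid'),  # e.g. a binary label
    ]
)
cnn.summary()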

