Ghideon

Everything posted by Ghideon

  1. Thanks, I'll check that later. Here is another one: Is that correct? It seems to imply that if 2 > 1 then 2 < -1, which is not true.
  2. This sounds like a different* approach from the random sentence generation in the OP: encoding or embedding a large amount of specific knowledge before running any searches? Here is a paper about Deep Learning for Symbolic Mathematics: https://arxiv.org/abs/1912.01412 Here is an article commenting on the paper: https://www.quantamagazine.org/symbolic-mathematics-finally-yields-to-neural-networks-20200520/ Maybe the above is a simple version of what you are trying to achieve? *) I may have misunderstood something; it could simply be a language issue.
  3. Question: do you assume |a1| >= |a2| and |b1| >= |b2|? From your attached paper: Without |a1| >= |a2| and |b1| >= |b2| the square roots may be imaginary. Is that intended? From your attached paper:
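The condition above can be illustrated in Python. This is only a sketch of the general point, not the paper's actual formula (which is not quoted here): a radicand of the hypothetical form |a1| - |a2| goes negative exactly when |a1| < |a2|, and the square root then becomes imaginary.

```python
import cmath
import math

# Hypothetical values with |a1| < |a2|; the actual expressions are in the attached paper.
a1, a2 = 2.0, 3.0

radicand = abs(a1) - abs(a2)   # negative when |a1| < |a2|

# The complex square root returns a purely imaginary number.
print(cmath.sqrt(radicand))    # 1j

# The real-valued square root rejects negative input entirely.
try:
    math.sqrt(radicand)
except ValueError as err:
    print(err)                 # math domain error
```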
  4. Thanks for your reply. I'll wait until you provide some way you attempt to separate the valid data from the huge amount of invalid data. Also note that you keep repeating the same simple examples. An analogy with your rectangle: assume the program gives "The area of a rectangle with sides 4 meters long is the sum of the length of its sides." 4+4+4+4 gives you 16, which happens to match the numerical value of the area. If we knew enough about area calculation to spot the issue, then we would have no need for the program? For a more contemporary example, the output could be something like "convolutional neural networks generally provide better results for image recognition than an architecture based on a recurrent neural network". It would be one of millions of similar statements.
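The numerical coincidence in the rectangle analogy is easy to check: for a square of side s, the (wrong) "sum of the sides" formula 4s matches the area s*s only when s = 4, so the program's statement happens to give the right number for this one input.

```python
# For a square with side s: perimeter = 4*s, area = s*s.
# The wrong formula (perimeter) matches the area only for s = 4.
for s in range(1, 10):
    perimeter = 4 * s
    area = s * s
    if perimeter == area:
        print(s, perimeter, area)  # 4 16 16
```

Any other side length would have exposed the wrong formula immediately.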
  5. For the paper Zak100 provided in the opening post, my understanding was that they provided experimental evidence that their approach was working and had good performance. Hence I recommended checking that it was possible to get hold of data first. But your contributions allow for a shift of focus as you suggested above; there are now multiple ways to get data and possible alternative approaches. I support your approach of starting with a prototype model, unless Zak still wants to pursue exactly the approach described in the paper in the OP. I agree. @zak100 Here is a quick attempt at providing a little quiz (hope @The Mule corrects me if I get this wrong): The model in the Keras code example provided by The Mule differs slightly from your initial question and the approach in the paper. Which Keras class may be a reasonable starting point for the kind of neural network that the paper* in your first post uses? *) The paper in the opening post: Towards Safer Smart Contracts: A Sequence Learning Approach to Detecting Security Threats
  6. You told me to wait: It seems though that my 1st question was answered. What we have looks pretty obvious; scientific papers written in the near future will most likely contain a combination of existing words (together with numbers and mathematical symbols). Even for some novel concept, existing words and mathematics could be used to define the new concepts. My 2nd and 3rd questions seem to be unanswered.
  7. I find the topic interesting! I'll try to read through the material and see if I can contribute. I must say though that this is outside my area of expertise, so I can't guarantee that I'll be successful within a reasonable time. I do not have enough experience to intuitively address your questions. That said, there are plenty of members quite skilled in mathematics. If necessary we may be able to extract some mathematics-specific question and post it in that section of the forums.
  8. Ok! I was curious if you had some facts to back up the statement or if it was a guess. I do not know; I have not studied this enough to provide an opinion. As for the paper in the OP, they obtained labels for smart contracts by running them through the Maian tool. To what extent that is tied to the number of contracts, and what it may imply for other methods, I have not investigated.
  9. Note that my comment regarding data was in the context of the paper you have in the opening post. I did a very quick check of the papers provided by @The Mule and I note that they use other approaches, which may have an impact on data requirements during training. I'll try to get some time to have a better look at the papers. Do you, by the way, have some argument why one hundred thousand would be enough? Good! I note that they provide a curated dataset with several types of vulnerabilities. That gives Zak100 some alternatives to the data referenced in the paper in the OP. Try the button labeled <> in the menu above the post you are creating. Example: <example>XML Code</example> And welcome and thanks for contributing to the discussion @The Mule.
  10. I tried that link and got the same result. So we do not know if we are watching and discussing the original content as it was aired by the TV channel, or a video possibly edited and shared by someone else, correct? The reasons for editing could be quite different if the editing was done by the production (as already stated by other members above) or if someone else modified the content originally aired by the TV channel.
  11. I do not yet have any opinion about a possible edit of the video material, but I am curious how you traced a possible edit or fake to the producer of the TV show. The YouTube account linked in the OP is not connected to the Discovery network as far as I can tell; do you have a link to the original content delivered by the scientific channel?
  12. Some feedback for your idea: 1: Your example requires that the word "area" existed before anyone came up with the idea of how to calculate area? Why would that be so, and why would it be so in general for new discoveries? 2: If we assume 1000 words (a small subset of English) and that a sentence of 10 words is sufficient to describe a novel discovery (I guess that is not nearly enough), then you have 2.6E+23 (263409560461970212832400) sentences* to choose from. Sorting out the interesting ones would require quite some effort? 3: Let's try the words in your first reply. Here are three examples using words from your sentence: I understand you very well. You don't understand very well. You don't understand. What algorithm should a computer (or human) use to select the sentence that "makes sense", unless it is already obvious? (I note that the OP has hit the max of 5 posts for the first day.) Side note: Using a more advanced model for generating sentences can get interesting results. The Generative Pre-trained Transformer 3 (GPT-3) is capable of generating human-like text, but I would not expect it to autogenerate papers containing novel scientific discoveries. https://en.wikipedia.org/wiki/GPT-3 *) Algorithms could be used to avoid generating clearly invalid sentences; the example is just an illustration of the number of combinations.
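The count quoted in point 2 corresponds to choosing 10 distinct words out of a 1000-word vocabulary without regard to order; Python's standard library can verify the exact figure (ordered word sequences would be far more numerous still).

```python
import math

# Number of ways to pick 10 distinct words from a 1000-word vocabulary
# (unordered selection; this matches the figure quoted in the post).
n = math.comb(1000, 10)

print(n)           # 263409560461970212832400
print(f"{n:.1e}")  # 2.6e+23
```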
  13. Because they have syntax errors. A trailing backslash is not allowed.* Also note that, whatever the arrows are supposed to mean, they have no special meaning in regular expressions; the arrows will be matched literally. *) In the regex flavor I am familiar with; as described above, there are different styles.
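Both points can be demonstrated with Python's `re` module (the exact error message and rules vary between regex flavors, as noted above):

```python
import re

# A trailing backslash is a syntax error: it starts an escape
# sequence with nothing left to escape.
try:
    re.compile("abc\\")  # the pattern text is: abc\
except re.error as err:
    print(err)           # e.g. "bad escape (end of pattern) at position 3"

# Arrow characters have no special meaning; they match themselves literally.
print(bool(re.search("→", "a → b")))  # True
```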
  14. As far as I can see, the steps described in the paper can be implemented in Python. But to know if you should use Python, and which parts of the project may benefit from other tools, a more thorough investigation would be required before I could make a statement. It may also depend on how much you intend to implement. Example: I note that data is preprocessed using the MAIAN tool (outlined in the paper) and that MAIAN depends on the Solidity compiler. Those tools may or may not be using Python, but to what extent that affects you is for you to investigate. How to implement and which libraries to use is another matter, depending on details not yet discussed. I could of course provide you with a list of popular Python frameworks as of 2020; there are numerous such lists available in any search engine you may want to use. But then you would still be left with sorting out which combination of the frameworks suits your needs. The task you are asking about is unlikely to be solved by a single framework or library. Personally I postpone such decisions (in commercial projects) until later in a project. In other cases specific requirements may tell me what to do; for instance, if I were consulting for a team that already runs everything on .NET, I would evaluate Azure products first.
  15. Is this still regarding regular expressions?
  16. That is a wider description than the one in the attached paper. I stress this because any hint I may give regarding an approach depends on what type of problem you are trying to solve. Are you using an opcode sequence as in the paper?
  17. Just so that we do not waste time: you are still looking at the problem in the paper, the usage of smart contract opcode sequences as input for learning a model to detect security threats. That specific problem, not some other variant, Ok?
  18. Yes. But CNN may not be applicable to the problem you have posted.
  19. Convolutional Neural Network. In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural networks, most commonly applied to analyzing visual imagery. (https://en.wikipedia.org/wiki/Convolutional_neural_network) A glossary with machine learning terms: https://developers.google.com/machine-learning/glossary
  20. No. No. I just stated that with insufficient data the project will fail regardless of NN architecture. I do not know how much you have studied the paper you provided or your level of knowledge about data processing, so the following example may be obvious: Ok. And the paper states about the contracts: So for a sample of 20-30 contracts there will be, on average, less than one vulnerable contract. Very simplified, we can say that using such a sample, where all objects are of one class (non-vulnerable), to train the model, you will end up with a model that classifies everything as belonging to that single class (non-vulnerable). The above example is simplified and intentionally naive; imbalanced data is common in machine learning and not specific to the data in the paper.
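The majority-class effect can be sketched with a toy example. The 3-in-100 vulnerability rate below is an assumption for illustration only, not a figure from the paper:

```python
# Toy illustration: a "model" that always predicts the majority class
# (non-vulnerable) looks accurate on imbalanced data but never finds
# a single vulnerability.
labels = [1] * 3 + [0] * 97       # 3 vulnerable contracts out of 100

predictions = [0] * len(labels)   # always predict "non-vulnerable"

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
recall = sum(p == 1 and y == 1 for p, y in zip(predictions, labels)) / labels.count(1)

print(accuracy)  # 0.97 -- looks impressive
print(recall)    # 0.0  -- but no vulnerable contract is ever detected
```

This is why plain accuracy is a misleading metric on imbalanced data.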
  21. Thanks for your attempt to explain. I believe I have basic knowledge about Newtonian physics and for any improvements needed I'll use other sources than your posts. As for the errors in your physics other members have already provided you with corrections; no need for me to repeat.
  22. Yes. Under the assumption that mass is constant, F = d/dt(mv). But that is not what you wrote: mv does not have the unit newton; ma does have the unit newton. They are not the same. What does that mean?
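The unit mismatch can be checked mechanically by tracking SI base-dimension exponents (kg, m, s) as tuples. This is a small sketch for the argument above, not a physics library:

```python
# Represent a physical dimension as exponents of (kg, m, s).
KG, M, S = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def mul(a, b):
    """Multiply two dimensions by adding exponents."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Divide two dimensions by subtracting exponents."""
    return tuple(x - y for x, y in zip(a, b))

velocity = div(M, S)             # m/s      -> (0, 1, -1)
acceleration = div(velocity, S)  # m/s^2    -> (0, 1, -2)

momentum = mul(KG, velocity)     # kg*m/s   -> (1, 1, -1)
force = mul(KG, acceleration)    # kg*m/s^2 -> (1, 1, -2), i.e. the newton

print(momentum == force)         # False: mv is not a force
print(div(momentum, S) == force) # True: d/dt(mv) does have the unit newton
```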
  23. Maybe because your descriptions do not make much sense to other members trying to understand? I would like to participate in the discussion but I can't follow the descriptions. (This applies also to 3, 4, 5 and 6 above.) That does not sound correct. Mass times velocity, mv, is momentum, not force.
  24. Just so that we do not waste time: you are looking at the problem in the paper, the usage of smart contract opcode sequences as input for learning a model to detect security threats, Ok? That specific problem, not some other variant, Ok? Did you look at my follow-up to that, in the context of the paper you provided? I do not claim there is no simpler solution to the problem described in the paper. I say that the authors of the paper seem to have come to the conclusion that, for the specific problem they worked on, simpler* solutions did not perform well. @PoetheProgrammer's response** is perfectly fine in a more general context and may reflect a different approach than the one the researchers in the paper used. If you extract other features from the data than they did in the paper, for instance some feature that is not a sequence, then other NN models or architectures may very well be used. Some of those could be simpler but could have other drawbacks. I would not do that. If there is insufficient data to train and test the model then the project is going to fail; no need to waste time. Unless there is some alternative model that would solve the problem with the limited amount of data you have. Why? Simpler than what? More hidden layers than what? *) Again, "simpler" is relative. I use the word as it is used in the paper; a Markov chain or 3-gram model is simpler than an LSTM model. **) From what I've seen so far, PoetheProgrammer has more knowledge than me in these topics. If my reply (unintentionally) contradicts them I would give Poethe's response more weight.
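For contrast with an LSTM, a 3-gram model over an opcode sequence is essentially just a table of counts of sliding windows. A minimal sketch (the short sequence below is made up for illustration; real sequences come from compiled contracts):

```python
from collections import Counter

# Hypothetical opcode sequence for illustration.
opcodes = ["PUSH1", "PUSH1", "MSTORE", "PUSH1", "PUSH1", "MSTORE", "CALL"]

# A 3-gram model counts every window of three consecutive opcodes.
trigrams = Counter(
    tuple(opcodes[i:i + 3]) for i in range(len(opcodes) - 2)
)

for gram, count in trigrams.most_common(3):
    print(count, gram)  # the most frequent window is printed first
```

Counting windows like this is "simpler" in the paper's sense: it captures only fixed-length local patterns, whereas an LSTM can, in principle, learn dependencies across the whole sequence.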
  25. What is an "ML network"? Maybe you mean neural network? Some comments regarding your request: 1: Given the conclusions of the paper, what indications do you have that a simple* solution will be effective for solving the problem**? From the paper: 2: Do you have the data required for training the model? As far as I know there exist a number of suitable libraries and frameworks for Python. *) I do not consider the NN architecture in the paper to be "simple", but simple is relative and this may be due to my lack of experience in this specific topic. **) I have to assume the problem is the same as discussed in the paper, not some other undisclosed thing.