Project Phase I
Due 10/25
Environment Setup:
Accept GitHub Classroom invitation by clicking on the link.
Then you will be able to create a team. Only one member needs to create the team; the other member can then join it.
After accepting the assignment, you will be directed to a repository initialized by our template code.
All source code files for each phase are under the PathExtractor directory. You only need to edit files in this directory to finish the project.
The README gives the steps to set up the environment and instructions on how to pull and merge the branches for each project phase.
We will release a pull request for you to obtain the template at the beginning of each phase.
Here is a step-by-step guide to getting started:
- Clone the repository to your machine. We recommend a Linux/Unix environment for the project.
For Windows users, check WSL.
- Change directory to your local repository.
- You will need Python in your environment.
- Create a virtual environment
python -m venv ENV
- Activate the virtual environment
  - For Windows
.\ENV\Scripts\activate
  - For macOS and Linux
source ENV/bin/activate
- Upgrade pip
python -m pip install --upgrade pip
- Install requirements
pip install -r requirements.txt
After setting up, you can run the tests. At this point they will fail, since the template code is not yet implemented.
cd tests/
pytest .
Structure of First Phase:
There are three files in the source folder.
- common.py:
You do not need to modify this file. It defines some utility functions that help normalize the output format.
It defines the NodeDataManager structure and two instances of it (node_property_manager and node_id_manager) to maintain our self-defined data that extends the tree-sitter Node.
node_id_manager maintains each node's child_id relative to its siblings. We will use it to calculate the Path Width.
node_property_manager keeps some additional type and name information for nodes. It can help format the output.
You can update and retrieve these data using the interfaces provided by the NodeDataManager.
- noder_property.py:
This is the property stored in node_property_manager's values. You may want to first implement the helper methods other than the initialization method.
This requires you to be familiar with the AST structure and the tree-sitter Java grammar. After that, you will be able to finish the initialization function.
To reduce your workload, the sanitization for formatting node names has mostly been done. You need to handle the abstract_type property.
For abstract_type,
- You need to keep an abstract type for unary expression nodes, binary expression nodes, and assignment expression nodes.
The abstract type is the concatenation of their original types and the operator.
- You need to keep an abstract type for nodes of Java primitive types.
For example, an Integer type in Java is treated as a regular Object type in tree-sitter, but we want to map it to integral_type (i.e., the type of int).
You may want to find the BoxedType-to-PrimitiveType mapping relation and complete the BoxedTypes dictionary first. (A good way to explore the attributes of tree-sitter is to write a very simple and short traverse function to print the information of interest.
You can also adjust the sample input as needed.)
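The two abstract_type rules above can be sketched roughly as follows. This is only an illustration, not the template's interface: the function name abstract_type_sketch, its parameters, and the dictionary name BOXED_TO_PRIMITIVE_SKETCH are hypothetical, and the operator is joined with no separator because the handout only says "concatenation". Only the Integer entry comes from this handout; the other entries are assumptions you should verify against the tree-sitter Java grammar.

```python
from typing import Optional

# Hypothetical, partial boxed-to-primitive mapping.
# Only "Integer" -> "integral_type" is stated in the handout; the
# remaining entries are assumptions to check against the grammar.
BOXED_TO_PRIMITIVE_SKETCH = {
    "Integer": "integral_type",
    "Double": "floating_point_type",   # assumption
    "Boolean": "boolean_type",         # assumption
}

# Node types whose abstract type includes the operator.
OPERATOR_NODE_TYPES = {
    "unary_expression",
    "binary_expression",
    "assignment_expression",
}


def abstract_type_sketch(
    node_type: str,
    operator: Optional[str] = None,
    type_name: Optional[str] = None,
) -> str:
    """Return an abstract type for a node described by plain strings.

    node_type: the tree-sitter type of the node.
    operator:  the operator text, for expression nodes.
    type_name: the written type name (e.g. "Integer"), for type nodes.
    """
    if node_type in OPERATOR_NODE_TYPES and operator is not None:
        # Assumption: plain concatenation with no separator.
        return node_type + operator
    if type_name in BOXED_TO_PRIMITIVE_SKETCH:
        return BOXED_TO_PRIMITIVE_SKETCH[type_name]
    # Every other node keeps its original type.
    return node_type
```

In the real project the operator and type name would be read from the tree-sitter node itself; passing them as strings here just keeps the sketch independent of the template code.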
- leaves_collector.py:
This is where you implement a depth-first scan of the AST. You will need to
(1) obtain the child id and additional properties and store them in the two data managers,
(2) collect leaf nodes (i.e., terminal nodes associated with their terminal values) and store them in the leaves array, which is a member of LeavesCollector.
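As a rough picture of the depth-first scan described above, here is a minimal sketch that records each node's index among its siblings and gathers the nodes without children. It does not use the template's LeavesCollector or NodeDataManager: FakeNode is a hypothetical stand-in that only mimics the type, children, and text attributes of a tree-sitter Node, and a plain dictionary stands in for node_id_manager. Whether every childless node (punctuation, for instance) should count as a leaf is an assumption here; check the expected output in the tests.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FakeNode:
    """Hypothetical stand-in with a few tree-sitter Node attributes."""
    type: str
    children: List["FakeNode"] = field(default_factory=list)
    text: bytes = b""


def collect_leaves(root: FakeNode) -> Tuple[List[FakeNode], Dict[int, int]]:
    """Depth-first scan returning (leaves, child_ids).

    child_ids maps id(node) to the node's index among its siblings;
    the root is given index 0 by convention in this sketch.
    """
    leaves: List[FakeNode] = []
    child_ids: Dict[int, int] = {}

    def visit(node: FakeNode, child_id: int) -> None:
        child_ids[id(node)] = child_id
        if not node.children:
            # Assumption: any node without children is a leaf.
            leaves.append(node)
            return
        for index, child in enumerate(node.children):
            visit(child, index)

    visit(root, 0)
    return leaves, child_ids


if __name__ == "__main__":
    # Tiny hand-built tree for `x = 1`, just to exercise the scan.
    tree = FakeNode(
        "assignment_expression",
        children=[
            FakeNode("identifier", text=b"x"),
            FakeNode("=", text=b"="),
            FakeNode("decimal_integer_literal", text=b"1"),
        ],
    )
    found, ids = collect_leaves(tree)
    for leaf in found:
        print(leaf.type, leaf.text, ids[id(leaf)])
```

The recursion keeps the sketch short; for very deep ASTs an explicit stack avoids Python's recursion limit.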
AST of Tree-sitter:
Tests:
After finishing, you can run your own tests with python -m PathExtractor.leaves_collector. This way, you can print the results to the console to check and debug.
Or you can run pytest tests/. This is similar to the tests we use in grading.
Feel free to modify the test in the main function of leaves_collector.py or the tests under ./tests.
Grading:
You will receive full credit if you pass all of our in-house test cases. Partial credit will be awarded based on the percentage of test cases passed.