Skip to content
This repository was archived by the owner on Sep 17, 2025. It is now read-only.

[WIP] Implementation of Python Stats API#149

Merged
liyanhui1228 merged 55 commits intocensus-instrumentation:masterfrom
kcooperstein:stats
May 15, 2018
Merged

[WIP] Implementation of Python Stats API#149
liyanhui1228 merged 55 commits intocensus-instrumentation:masterfrom
kcooperstein:stats

Conversation

@kcooperstein
Copy link
Contributor

No description provided.

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

@kcooperstein
Copy link
Contributor Author

I signed it!

@tags07 tags07 reopened this Apr 6, 2018
@kcooperstein
Copy link
Contributor Author

I signed it!

@googlebot
Copy link

CLAs look good, thanks!

@liyanhui1228 liyanhui1228 self-requested a review April 6, 2018 22:51
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add this license to all the other files.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please use the correct year.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

def __init__(self, boundaries=None, distribution=None, aggregation_type="distribution"):
super().__init__(aggregation_type, boundaries)
self.aggregation_type = aggregation_type
'''self.boundaries = list(bucket_boundaries.BucketBoundaries(boundaries) or [])'''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

self.boundaries = bucket_boundaries.BucketBoundaries(boundaries)
self.distribution = dict(distribution or {})

def get_aggregation(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Python we don't really use getters and setters like in Java, because we can get the variable just by calling self.aggregation_type in this case. Usually when we use getter method, we add a @property decorator before the method signature, and the variable the getter returned is a private variable in this class which has a _ before the variable name. Then with this decorated getter method, calling self.aggregation_type will return the value of the private variable self._aggregation_type in this class.

For more details look at this link: https://www.python-course.eu/python3_properties.php

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

self.resource = client.resource('global', {})

def emit(self, views, datapoint=None):
name = 'projects/{}'.format(self.project_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this line to __init__ so you don't need to construct the name every time you call emit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


def emit(self, views, datapoint=None):
name = 'projects/{}'.format(self.project_id)
'''metric = {}'''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

setup.py Outdated

install_requires = [
'google-cloud-trace>=0.17.0, <0.18dev',
'google-cloud-monitoring>=0.28.1'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

google-cloud-monitoring>=0.28.1, <0.29,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

else:
return "key is not in map"


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean up the extra new lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return False



Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean up extra new lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

metrics[metric_type] = metric_label
labels.append(metric_label)
return metrics

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean up extra new lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


def translate_to_stackdriver(self, views):
metrics = {}
labels = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of labels?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will remove

return 'Successfully wrote time series.'

def set_resource(self, type_, labels):
return self.client.resource(type_, labels)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set 'global' as the default type_ here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from opencensus.trace import tracer as tracer_module
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why changing this file?

@liyanhui1228
Copy link
Contributor

FYI I updated the nox.py config to enable it run on both trace and stats directory, please fix the lint warnings and test coverage.

def value(self):
return self._value

def is_value(self, value):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to check_value?

if key in self.map:
return self.map[key]
else:
return "key is not in map"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to emit a warning message, it should be either in the msg of an exception or a logging msg, but should not be directly returned. Because it will mess up with your intended return type of the tag value. In this case, probably change to something like:

value = self.map.get(key, None)

if value is None:
    raise KeyError('Key is not in map.')

return value

if key in self.map:
return True
else:
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify this with:

return key in self.map

if map is None:
self.map = {}
else:
self.map = map
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.map = map if map is not None else {}

self.map = map
if tags is not None:
self.tags = tags
print(self.tags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove print statement?

self.tags = tags
print(self.tags)
for tag in self.tags:
print(tag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

def name(self):
return self._name

def is_valid(self, name):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to check_name?


def get_tag_value(self, key):
if key in self._tags:
return self._tags[key]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return self._tags.get(key, None)

@@ -0,0 +1 @@
/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/include/python2.7 No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use this command git rm --cached venv/include/python2.7 to ignore the previously committed file and it will be ignored after that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the venv/ files from the commits.

@@ -0,0 +1 @@
{"last_check":"2018-02-13T00:28:25Z","pypi_version":"9.0.1"} No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still here?

Copy link
Contributor

@wkiser wkiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I think it might make sense to split this into a few smaller PRs since there is a lot going on. Maybe start with the base classes + related tests?

class BaseAggregation(object):

def __init__(self, aggregation_type=None, buckets=None):
if aggregation_type is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aggregation_type can probably be a class level variable since it will be the same for all objects of the type.

else:
self._aggregation_type = "none"

if buckets is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think self._buckets = buckets or [] is more readable, but I'm not sure if that is against this repo's style guide.

def __init__(self, count=None, aggregation_type="count"):
super().__init__(aggregation_type)
self._aggregation_type = aggregation_type
if count is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can make 0 the default value for count in the initializer: count=0,...

self._count_data = count_data

def add_sample(self, value):
self._count_data = self._count_data + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._count_data += 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kcooperstein pls change this.

self._sum_data = sum_data

def add_sample(self, value):
self._sum_data = self._sum_data + value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._sum_data += value

self.client = client
self.project_id = client.project
self.resource = client.resource('global', {})
name = 'projects/{}'.format(self.project_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

self.client.resource = self.set_resource(type_='global', labels= {'project_id': self.project_id})
self.client.write_point(metric, self.client.resource, datapoint, datetime.utcnow() + timedelta(seconds=60), datetime.utcnow())
self.client.time_series(metric, self.client.resource, datapoint, datetime.utcnow() + timedelta(seconds=60), datetime.utcnow())
return 'Successfully wrote time series.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would log this rather than return it.

return self._count_data

class DistributionAggregationData(BaseAggregationData):
def __init__(self, mean_data, count_data, min, max, sum_of_sqd_deviations, counts_per_bucket, bounds):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might want to do min_, max_ since min and max are reserved words

self.client.time_series(metric, self.client.resource, datapoint, datetime.utcnow() + timedelta(seconds=60), datetime.utcnow())
return 'Successfully wrote time series.'

def set_resource(self, type_='global', labels={}):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird python quirk, but it's considered dangerous to use an empty dict as the default value to a function. I would replace with labels=None and create a new empty dict inside the function body is labels is None.

class TagMap(object):

def __init__(self, tags=None, map=None):
self.map = map if map is not None else {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can do self.map = map or {}

@liyanhui1228
Copy link
Contributor

Adding @songy23 for sanity check.

self._registered_views[measure.name] = view
if registered_measure is None:
self._registered_measures[measure.name] = measure
self._map[view.measure.name] = ViewData(view=view, start_time=timestamp, end_time=timestamp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding comment based on our discussion here for future reference: self._map should be implemented as a multi-map instead of a single map, as there could be multiple view data using the same measure. And in this way there will only be one view data because everything before are overwrote.

Copy link
Contributor

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass - I haven't looked at the tag part.

return self._count


class MeanAggregation(BaseAggregation):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mean Aggregation should be removed now: census-instrumentation/opencensus-specs#56.

return self._count_data


class MeanAggregationData(BaseAggregationData):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

class BucketBoundaries(object):

def __init__(self, boundaries=None):
self._boundaries = list(boundaries or [])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also check the boundaries are in ascending order.

class MeasureToViewMap(object):

def __init__(self):
self._map = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a multi map (Java counterpart).

self._view = view
self._start_time = start_time
self._end_time = end_time
self._rows = rows if rows is not None else {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW rows and tag_value_aggregation_map are the same thing.

Copy link
Contributor

@liyanhui1228 liyanhui1228 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM excepts the Circle CI needs to be green before we merge. Please fix the lint and coverage tests. And don't forget to remove the unnecessary files like venv/. Use git rm --cached venv/include/python2.7 to remove the committed files from the commits.

Copy link
Contributor

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider moving the exporter code to another PR.

:type count_data: int
:param count_data: the count value of the distribution

:type min_: double
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: min_ -> min

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe min is a keyword in python which is why it is not used, would you recommend to use the keyword instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, I'm fine with leaving it as-is. I was just wondering why these two were different from other attributes.

:type min_: double
:param min_: the minimum value of the distribution

:type max_: double
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

:rtype: bool
:returns: True if it valid, else returns False
"""
if (len(name) > 0) and (len(name) <= 255):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also check if the given string is a ASCII string. See Go and specs.

"""The current value"""
return self._value

def is_valid_value(self, value):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, you should check if it's an ASCII string.

return self._rows

@property
def tag_value_aggregation_map(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key of this map should be different combinations of TagValues, rather than single TagValue.

def record(self, context, value, timestamp):
"""records the view data against context"""
tag_values = self.get_tag_values(tags=self.get_tag_map(context), columns=self.view.columns)
for tag_value in tag_values:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, you should check and update entry for unique combination of TagValues instead of going through each one.


def get_view(self, view_name, timestamp):
"""get the View Data from the given View name"""
view = self.get_view_data(view_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This view is actually a ViewData?

if view is None:
return None

view_data = ViewData(view=view, start_time=timestamp, end_time=timestamp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC the variable view is a ViewData rather than a View. How do you set the rows/aggregation map for the ViewData?

@@ -0,0 +1 @@
{"traceId": "f8739df974a4481f98748cd92b27177d", "spans": [{"displayName": {"value": "child_span", "truncated_byte_count": 0}, "spanId": "739fba56569f4106", "startTime": "2018-05-07T21:04:16.261253Z", "endTime": "2018-05-07T21:04:18.266410Z", "childSpanCount": 0, "kind": 0, "parentSpanId": "c3e7efa94eb747d3"}, {"displayName": {"value": "root_span", "truncated_byte_count": 0}, "spanId": "c3e7efa94eb747d3", "startTime": "2018-05-07T21:04:14.256056Z", "endTime": "2018-05-07T21:04:18.266438Z", "childSpanCount": 1, "kind": 0, "parentSpanId": "6e0c63257de34c92"}]} No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file accidentally committed? Remove it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this file.

self._count_data = count_data

def add_sample(self, value):
self._count_data = self._count_data + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kcooperstein pls change this.

:param count_data: the count value of the distribution

:type min_: double
:type min_: min
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min is not a type

:param min_: the minimum value of the distribution

:type max_: double
:type max_: max
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Contributor

@liyanhui1228 liyanhui1228 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the final review comments and we'll be good to merge.





Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two blank lines between functions.

@@ -0,0 +1 @@
{"traceId": "f8739df974a4481f98748cd92b27177d", "spans": [{"displayName": {"value": "child_span", "truncated_byte_count": 0}, "spanId": "739fba56569f4106", "startTime": "2018-05-07T21:04:16.261253Z", "endTime": "2018-05-07T21:04:18.266410Z", "childSpanCount": 0, "kind": 0, "parentSpanId": "c3e7efa94eb747d3"}, {"displayName": {"value": "root_span", "truncated_byte_count": 0}, "spanId": "c3e7efa94eb747d3", "startTime": "2018-05-07T21:04:14.256056Z", "endTime": "2018-05-07T21:04:18.266438Z", "childSpanCount": 1, "kind": 0, "parentSpanId": "6e0c63257de34c92"}]} No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this file.

venv/.Python Outdated
@@ -0,0 +1 @@
/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/Python No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still here?

@@ -0,0 +1 @@
{"last_check":"2018-02-13T00:28:25Z","pypi_version":"9.0.1"} No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still here?

@liyanhui1228
Copy link
Contributor

We can ignore the CLA, merging...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants